From text to speech: the MITalk system
Main authors: | Allen, Jonathan; Hunnicutt, M. Sharon; Klatt, Dennis |
---|---|
Format: | Book |
Language: | English |
Published: |
Cambridge [etc.]
Cambridge Univ. Pr.
1987
|
Edition: | 1. publ. |
Series: | Cambridge studies in speech science and communication.
|
Subjects: | |
Online access: | Table of contents |
Description: | Bibliography: pp. 207-215 |
Physical description: | XI, 216 pp., diagrams |
ISBN: | 0521306418 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV002060842 | ||
003 | DE-604 | ||
005 | 20040315 | ||
007 | t | ||
008 | 890928s1987 d||| |||| 00||| eng d | ||
020 | |a 0521306418 |9 0-521-30641-8 | ||
035 | |a (OCoLC)12668869 | ||
035 | |a (DE-599)BVBBV002060842 | ||
040 | |a DE-604 |b ger |e rakddb | ||
041 | 0 | |a eng | |
049 | |a DE-91 |a DE-473 |a DE-824 |a DE-29 |a DE-739 |a DE-355 |a DE-19 |a DE-83 |a DE-188 | ||
050 | 0 | |a TK7882.S65 | |
082 | 0 | |a 006.5 |2 19 | |
084 | |a ES 945 |0 (DE-625)27935: |2 rvk | ||
084 | |a ST 306 |0 (DE-625)143654: |2 rvk | ||
084 | |a ELT 533f |2 stub | ||
100 | 1 | |a Allen, Jonathan |e Verfasser |4 aut | |
245 | 1 | 0 | |a From text to speech |b the MITalk system |c Jonathan Allen ; M. Sharon Hunnicutt and Dennis Klatt |
250 | |a 1. publ. | ||
264 | 1 | |a Cambridge u.a. |b Cambridge Univ. Pr. |c 1987 | |
300 | |a XI, 216 S. |b graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 0 | |a Cambridge studies in speech science and communication. | |
500 | |a Literaturverz. S. 207 - 215 | ||
650 | 4 | |a Parole - Synthèse | |
650 | 4 | |a Parole, Systèmes de traitement de la | |
650 | 4 | |a Automatic Data Processing | |
650 | 4 | |a Communication | |
650 | 4 | |a Speech | |
650 | 4 | |a Speech Perception | |
650 | 4 | |a Speech processing systems | |
650 | 4 | |a Speech synthesis | |
650 | 4 | |a Text files | |
650 | 0 | 7 | |a Textverarbeitung |0 (DE-588)4059667-9 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Sprachverarbeitung |0 (DE-588)4116579-2 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Sprachverarbeitung |0 (DE-588)4116579-2 |D s |
689 | 0 | 1 | |a Textverarbeitung |0 (DE-588)4059667-9 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Hunnicutt, M. Sharon |e Verfasser |4 aut | |
700 | 1 | |a Klatt, Dennis |e Verfasser |4 aut | |
856 | 4 | 2 | |m HBZ Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=001347768&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
943 | 1 | |a oai:aleph.bib-bvb.de:BVB01-001347768 |
Record in the search index
_version_ | 1814427036454748160 |
---|---|
adam_text |
Contents
List of contributors xi
Preface 1
1 Introduction 7
1.1 Constraints on speech synthesis 7
1.2 Synthesis techniques 9
1.3 Functional outline of MITalk 12
I Analysis
2 Text preprocessing 16
2.1 Overview 16
2.2 Input 17
2.3 Output 18
2.4 Formatting operations 18
3 Morphological analysis 23
3.1 Overview 23
3.2 Input 27
3.3 Output 27
3.4 The algorithm 28
3.5 An example of a decomposition 35
3.6 The lexicon 36
4 The phrase-level parser 40
4.1 Overview 40
4.2 Input 41
4.3 Output 41
4.4 Parts of speech 41
4.5 The part-of-speech processor 43
4.6 The parser algorithm 45
4.7 Some examples 51
5 Morphophonemics and stress adjustment 52
5.1 Overview 52
5.2 Input 52
5.3 Output 52
5.4 Morphophonemic rules 52
5.5 Stress modification rules 54
5.6 An example 54
6 Letter-to-sound and lexical stress 57
6.1 Overview 57
6.2 Letter-to-sound 57
6.3 Lexical stress placement 61
6.4 An example 69
II Synthesis
7 Survey of speech synthesis technology 71
7.1 Overview 71
7.2 Background 72
7.3 Synthesis techniques 73
7.4 Applications 79
8 The phonological component 81
8.1 Overview 81
8.2 Input representation for a sentence 81
8.3 Comparison between ideal synthesis input and system performance 85
8.4 Stress rules 86
8.5 Rules of segmental phonology 87
8.6 Pauses 88
8.7 Evaluation of the analysis modules 89
9 The prosodic component 93
9.1 Overview 93
9.2 Segmental durations 93
10 The fundamental frequency generator 100
10.1 Overview 100
10.2 Input 101
10.3 Output 102
10.4 The O'Shaughnessy fundamental frequency algorithm 103
10.5 Adjustments to the O'Shaughnessy algorithm 107
10.6 Potential improvements from additional syntactic information 107
11 The phonetic component 108
11.1 Overview 108
11.2 "Synthesis-by-analysis" of consonant-vowel syllables 109
11.3 General rules for the synthesis of phonetic sequences 116
11.4 Summary 122
12 The Klatt formant synthesizer 123
12.1 Overview 123
12.2 Vocal tract transfer functions 139
12.3 Radiation characteristic 150
13 Some measures of intelligibility and comprehension 151
13.1 Overview 151
13.2 Phoneme recognition 152
13.3 Word recognition in sentences 157
13.4 Comprehension 161
13.5 General discussion and conclusions 167
14 Implementation 172
14.1 Conceptual organization 172
14.2 Development system 173
14.3 Performance system 174
14.4 UNIX implementation 174
14.5 Using the system 175
Appendixes
A Part-of-speech processor 177
B Klatt symbols 179
C Context-dependent rules for PHONET 181
D Sample test trials from the Modified Rhyme Test 202
E Sample test materials from the Harvard Psychoacoustic Sentences 203
F Sample test materials from the Haskins Anomalous Sentences 204
G Sample passage used to test listening comprehension 205
References 207
Index 215
List of figures
2-1 Example of FORMAT processing 18
3-1 State transition diagram for the morph sequence FSM 31
3-2 Decomposition of "scarcity" 37
4-1 Noun group ATN listing 47
4-2 Verb group ATN listing 48
4-3 ATN diagram for verb groups 49
4-4 ATN diagram for noun groups 50
4-5 Example of PARSER operation 51
5-1 Input to and output from SOUND1 55
6-1 Suffix detection in the word finishing 58
6-2 Application of letter-to-sound rules to caribou 60
6-3 Application of letter-to-sound rules to subversion 60
6-4 Example of letter-to-sound and stress rule operation 69
7-1 Synthesis blocks of the MITalk system 72
7-2 An example of the differences between words spoken in isolation and words spoken as a continuous utterance 74
8-1 Example of PHONO1 and PHONO2 processing 82
9-1 Example of the processing performed by PROSOD 94
10-1 Example of F0 contours 105
11-1 Spectrum analysis of a speech waveform 111
11-2 First and second formant motions in English vowels 112
11-3 Linear prediction of plosive bursts before vowels 113
11-4 Frequency of the lowest three formants measured at voicing onset for syllables involving BB, DD, and GG 114
11-5 Synthesis strategy for a CV syllable 115
11-6 Templates for smoothing adjacent phonetic segment targets 117
11-7 Constants used to specify the inherent formant and durational characteristics of a sonorant 120
12-1 Interface between synthesizer software and hardware 123
12-2 Components of the output spectrum of a speech sound 125
12-3 Parallel and cascade simulation of the vocal tract transfer function 126
12-4 Cascade/parallel configurations supported by MITalk 127
12-5 Block diagram and frequency response of a digital resonator 129
12-6 Block diagram of the cascade/parallel formant synthesizer 131
12-7 Four periods from voicing waveforms 135
12-8 Waveform segment and magnitude spectrum of frication noise 137
12-9 Magnitude of the vocal tract transfer function 141
12-10 Nasalization of the vowel ih in the syllable "dim" 143
12-11 Effect of parameter changes on the vocal tract transfer function 146
12-12 Preemphasized output spectra from cascade and parallel models 148
12-13 Spectra from two different parallel synthesis configurations 149
12-14 Transfer function of the radiation characteristic 150
13-1 Average percent errors across various manner classes 154
13-2 Distribution of errors and most frequent perceptual confusions 155
13-3 Percent correct comprehension scores for reading and listening groups 165
14-1 Sample MITalk session 176
C-1 Pre-aspiration parameter smoothing 189
C-2 Diphthong transition smoothing 194
List of tables
2-1 Abbreviation translations performed by FORMAT 19
3-1 Morph spelling change rules for vocalic suffixes 36
8-1 Klatt symbols used in the synthesis modules 84
9-1 Minimum and inherent durations in msec for each segment type 96
10-1 Relative peak levels of words according to their parts of speech 101
11-1 Parameter values for the synthesis of selected vowels 119
11-2 Parameter values for the synthesis of selected components of English consonants before front vowels 121
11-3 Variable control parameters specified in PHONET 122
12-1 List of control parameters for the software formant synthesizer 132
13-1 Characteristics of the passages used to measure comprehension 163
B-1 Klatt symbols for phonetic segments 179
B-2 Klatt symbols for nonsegmental units 180
C-1 Parameter targets for nonvocalic segments 186
C-2 Parameter targets for vocalic segments 187
C-3 Default values for duration of forward smoothing (Tcf) 188
C-4 Default values for Bper 188
C-5 Diphthong transition parameters 194
C-6 Duration of forward smoothing for obstruents (Tcobst) 196
C-7 Default plosive burst duration 197 |
any_adam_object | 1 |
author | Allen, Jonathan Hunnicutt, M. Sharon Klatt, Dennis |
author_facet | Allen, Jonathan Hunnicutt, M. Sharon Klatt, Dennis |
author_role | aut aut aut |
author_sort | Allen, Jonathan |
author_variant | j a ja m s h ms msh d k dk |
building | Verbundindex |
bvnumber | BV002060842 |
callnumber-first | T - Technology |
callnumber-label | TK7882 |
callnumber-raw | TK7882.S65 |
callnumber-search | TK7882.S65 |
callnumber-sort | TK 47882 S65 |
callnumber-subject | TK - Electrical and Nuclear Engineering |
classification_rvk | ES 945 ST 306 |
classification_tum | ELT 533f |
ctrlnum | (OCoLC)12668869 (DE-599)BVBBV002060842 |
dewey-full | 006.5 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 006 - Special computer methods |
dewey-raw | 006.5 |
dewey-search | 006.5 |
dewey-sort | 16.5 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik Sprachwissenschaft Elektrotechnik Literaturwissenschaft |
edition | 1. publ. |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>00000nam a2200000 c 4500</leader><controlfield tag="001">BV002060842</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20040315</controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">890928s1987 d||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">0521306418</subfield><subfield code="9">0-521-30641-8</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)12668869</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV002060842</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakddb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-91</subfield><subfield code="a">DE-473</subfield><subfield code="a">DE-824</subfield><subfield code="a">DE-29</subfield><subfield code="a">DE-739</subfield><subfield code="a">DE-355</subfield><subfield code="a">DE-19</subfield><subfield code="a">DE-83</subfield><subfield code="a">DE-188</subfield></datafield><datafield tag="050" ind1=" " ind2="0"><subfield code="a">TK7882.S65</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">006.5</subfield><subfield code="2">19</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ES 945</subfield><subfield code="0">(DE-625)27935:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 306</subfield><subfield code="0">(DE-625)143654:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ELT 533f</subfield><subfield 
code="2">stub</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Allen, Jonathan</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">From text to speech</subfield><subfield code="b">the MITalk system</subfield><subfield code="c">Jonathan Allen ; M. Sharon Hunnicutt and Dennis Klatt</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">1. publ.</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Cambridge u.a.</subfield><subfield code="b">Cambridge Univ. Pr.</subfield><subfield code="c">1987</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XI, 216 S.</subfield><subfield code="b">graph. Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">Cambridge studies in speech science and communication.</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Literaturverz. S. 
207 - 215</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Parole - Synthèse</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Parole, Systèmes de traitement de la</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Automatic Data Processing</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Communication</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Speech</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Speech Perception</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Speech processing systems</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Speech synthesis</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Text files</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Textverarbeitung</subfield><subfield code="0">(DE-588)4059667-9</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Sprachverarbeitung</subfield><subfield code="0">(DE-588)4116579-2</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Sprachverarbeitung</subfield><subfield code="0">(DE-588)4116579-2</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Textverarbeitung</subfield><subfield code="0">(DE-588)4059667-9</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Hunnicutt, M. 
Sharon</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Klatt, Dennis</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">HBZ Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&amp;doc_library=BVB01&amp;local_base=BVB01&amp;doc_number=001347768&amp;sequence=000002&amp;line_number=0001&amp;func_code=DB_RECORDS&amp;service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="943" ind1="1" ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-001347768</subfield></datafield></record></collection> |
id | DE-604.BV002060842 |
illustrated | Illustrated |
indexdate | 2024-10-31T11:01:03Z |
institution | BVB |
isbn | 0521306418 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-001347768 |
oclc_num | 12668869 |
open_access_boolean | |
owner | DE-91 DE-BY-TUM DE-473 DE-BY-UBG DE-824 DE-29 DE-739 DE-355 DE-BY-UBR DE-19 DE-BY-UBM DE-83 DE-188 |
owner_facet | DE-91 DE-BY-TUM DE-473 DE-BY-UBG DE-824 DE-29 DE-739 DE-355 DE-BY-UBR DE-19 DE-BY-UBM DE-83 DE-188 |
physical | XI, 216 S. graph. Darst. |
publishDate | 1987 |
publishDateSearch | 1987 |
publishDateSort | 1987 |
publisher | Cambridge Univ. Pr. |
record_format | marc |
series2 | Cambridge studies in speech science and communication. |
spelling | Allen, Jonathan Verfasser aut From text to speech the MITalk system Jonathan Allen ; M. Sharon Hunnicutt and Dennis Klatt 1. publ. Cambridge u.a. Cambridge Univ. Pr. 1987 XI, 216 S. graph. Darst. txt rdacontent n rdamedia nc rdacarrier Cambridge studies in speech science and communication. Literaturverz. S. 207 - 215 Parole - Synthèse Parole, Systèmes de traitement de la Automatic Data Processing Communication Speech Speech Perception Speech processing systems Speech synthesis Text files Textverarbeitung (DE-588)4059667-9 gnd rswk-swf Sprachverarbeitung (DE-588)4116579-2 gnd rswk-swf Sprachverarbeitung (DE-588)4116579-2 s Textverarbeitung (DE-588)4059667-9 s DE-604 Hunnicutt, M. Sharon Verfasser aut Klatt, Dennis Verfasser aut HBZ Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=001347768&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Allen, Jonathan Hunnicutt, M. Sharon Klatt, Dennis From text to speech the MITalk system Parole - Synthèse Parole, Systèmes de traitement de la Automatic Data Processing Communication Speech Speech Perception Speech processing systems Speech synthesis Text files Textverarbeitung (DE-588)4059667-9 gnd Sprachverarbeitung (DE-588)4116579-2 gnd |
subject_GND | (DE-588)4059667-9 (DE-588)4116579-2 |
title | From text to speech the MITalk system |
title_auth | From text to speech the MITalk system |
title_exact_search | From text to speech the MITalk system |
title_full | From text to speech the MITalk system Jonathan Allen ; M. Sharon Hunnicutt and Dennis Klatt |
title_fullStr | From text to speech the MITalk system Jonathan Allen ; M. Sharon Hunnicutt and Dennis Klatt |
title_full_unstemmed | From text to speech the MITalk system Jonathan Allen ; M. Sharon Hunnicutt and Dennis Klatt |
title_short | From text to speech |
title_sort | from text to speech the mitalk system |
title_sub | the MITalk system |
topic | Parole - Synthèse Parole, Systèmes de traitement de la Automatic Data Processing Communication Speech Speech Perception Speech processing systems Speech synthesis Text files Textverarbeitung (DE-588)4059667-9 gnd Sprachverarbeitung (DE-588)4116579-2 gnd |
topic_facet | Parole - Synthèse Parole, Systèmes de traitement de la Automatic Data Processing Communication Speech Speech Perception Speech processing systems Speech synthesis Text files Textverarbeitung Sprachverarbeitung |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=001347768&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT allenjonathan fromtexttospeechthemitalksystem AT hunnicuttmsharon fromtexttospeechthemitalksystem AT klattdennis fromtexttospeechthemitalksystem |