Fundamentals of speech synthesis and speech recognition: basic concepts, state of the art and future challenges
Other authors: | |
---|---|
Format: | Book |
Language: | Undetermined |
Published: |
Chichester [et al.]
Wiley
1995
|
Edition: | Repr. |
Subjects: | |
Online access: | Table of contents |
Description: | XIII, 379 pp. |
ISBN: | 0471944491 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV026180552 | ||
003 | DE-604 | ||
005 | 20110228 | ||
007 | t | ||
008 | 110326r1995uuuu |||| 00||| und d | ||
020 | |a 0471944491 |9 0-471-94449-1 | ||
035 | |a (OCoLC)174338836 | ||
035 | |a (DE-599)BVBBV026180552 | ||
040 | |a DE-604 |b ger |e rakwb | ||
041 | |a und | ||
049 | |a DE-188 | ||
082 | 0 | |a 006.454 | |
084 | |a CQ 4000 |0 (DE-625)19006: |2 rvk | ||
084 | |a ES 945 |0 (DE-625)27935: |2 rvk | ||
084 | |a ST 306 |0 (DE-625)143654: |2 rvk | ||
245 | 1 | 0 | |a Fundamentals of speech synthesis and speech recognition |b basic concepts, state of the art and future challenges |c ed. by Eric Keller |
250 | |a Repr. | ||
264 | 1 | |a Chichester <<[u.a.]>> |b Wiley |c 1995 | |
300 | |a XIII, 379 S. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
650 | 0 | 7 | |a Sprachverarbeitung |0 (DE-588)4116579-2 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Automatische Spracherkennung |0 (DE-588)4003961-4 |2 gnd |9 rswk-swf |
655 | 7 | |0 (DE-588)4143413-4 |a Aufsatzsammlung |2 gnd-content | |
689 | 0 | 0 | |a Automatische Spracherkennung |0 (DE-588)4003961-4 |D s |
689 | 0 | |5 DE-604 | |
689 | 1 | 0 | |a Sprachverarbeitung |0 (DE-588)4116579-2 |D s |
689 | 1 | |5 DE-604 | |
700 | 1 | |a Keller, Eric |4 edt | |
856 | 4 | 2 | |m HEBIS Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=021765109&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-021765109 |
Record in the search index
_version_ | 1804144636072034304 |
---|---|
adam_text | Fundamentals of Speech
Synthesis and Speech
Recognition
Basic Concepts, State of the Art
and Future Challenges
Edited by
Eric Keller
University of Lausanne, Switzerland
JOHN WILEY & SONS
Chichester • New York • Brisbane • Toronto • Singapore
Contents
Preface xi
SECTION 1 BACKGROUND 1
E Keller and J Caelen
1 Fundamentals of Phonetic Science 5
E Keller
The Communication Process: Transmission Despite a Noisy Line 5
A Capsule Summary of Speech Articulation 9
Speech as an Integrated String of Speech Sounds 14
Speech Signal Analysis Techniques 17
Conclusion 20
References 20
2 Prosodic Aspects of Speech 23
S Werner and E Keller
Prosodic Phenomena in Detail 26
Prosodic Universals and Language-Specific Differences 33
The Relevance of Prosody for Speech Synthesis and Speech Recognition 36
Conclusion: Some Terminological Remarks and Relationships to Other Linguistic Domains 37
References 39
3 Pauses and the Temporal Structure of Speech 41
B Zellner
Pauses 42
The Durational Structure of Speech 49
A Parsing Tool: The Performance Structures 50
The Automatic Generation of Durations: Algorithms 55
Statistical Systems 58
Conclusion 60
The Keller-Zellner Algorithm 60
References 61
SECTION 2 STATE OF THE ART 63
E Keller and J Caelen
4 Subphonemic Segment Inventories for Concatenative Speech Synthesis 69
P Bhaskararao
Types of Speech Synthesis 69
Segments for Concatenative Synthesis 70
Preparation of a Subphonemic Segment Inventory 74
Conclusion 82
References 83
5 Text-to-Speech Synthesis: An Introduction and A Case Study 87
B Pfister and C Traber
Major Problems and Possible Solutions 88
A Case Study: The SVOX TTS System for German 93
Conclusion 105
References 105
6 Formant Synthesis 109
T Styger and E Keller
An Overview of Synthesis Techniques 109
Source-Filter Model 113
Example of a Formant Synthesiser 120
Conclusion 125
References 126
Annex: Terminal Analog Synthesiser 127
7 Automatic Speech and Speaker Recognition: Overview, Current Issues and Perspectives 129
G Chollet
What are the Challenges in ASR? 131
The Place of ASR within Speech Technology 132
Phonetics and ASR 133
Speech Analysis Techniques in ASR 134
The Variability of Acoustic Parameters and the Problem of Speaker Recognition 138
Speech Recognition 139
Evaluation 142
Conclusions and Further Outlook 142
References and Further Reading 143
Other Sources of References 146
8 Stochastic Models and Artificial Neural Networks for Automatic Speech Recognition 149
K Torkkola
Knowledge-Based vs Data-Based Approaches 149
Stochastic Models for Speech Recognition 150
Artificial Neural Networks for Speech Recognition 159
Hybrid Methods for Speech Recognition 164
Conclusion 166
Acknowledgements 166
References 166
SECTION 3 CHALLENGES 171
E Keller and J Caelen
9 The Prediction of Vowel Systems: Perceptual Contrast and Stability 185
L-J Boe, J-L Schwartz and N Vallee
The Framework 187
Strategy 190
The Model 193
Results 198
Discussion 202
Conclusion 209
References 210
10 Articulatory Models in Speech Synthesis 215
B Gabioud
The Need for Articulatory Models 216
A Chain of Models 217
A Convergence of Ideas 218
Maeda's Statistical Model 221
Trends for the Future 224
Conclusions 228
References 229
11 Dynamic Modelling and Control of Speech Articulators: Application to Vowel Reduction 231
P Perrier and D J Ostry
Dynamic Properties of Speech Articulators 233
Control of Speech Articulators 238
The Equilibrium-Point Hypothesis and Vowel Reduction 244
Conclusion 247
References 249
12 Phonological Structure, Parametric Phonetic Interpretation and Natural-Sounding Synthesis 253
J Local
Synthesis by Rule: The Standard Model 254
Non-Segmental Phonological Representation 256
Firthian Non-Segmental Synthesis: The YorkTalk Model 258
Conclusion 268
References 268
13 Semantic and Pragmatic Prediction of Prosodic Structures 271
G Caelen-Haumont
Relations between Syntax, Semantics and Prosody in Previous Research 272
Functional Syntax and the Semantic Approach 273
Which Kind of Predictive Model? 276
Hypotheses and Experimental Method 276
Linguistic Models and Prediction 277
Analysis Method and Results 284
Conclusion 290
Acknowledgements 290
References 290
14 Separating Simultaneous Sound Sources: Issues, Challenges and Models 297
M Cooke and G J Brown
Gestalt Principles of Perceptual Organisation and Auditory Scene Analysis 298
Representations for Computational Auditory Scene Analysis 299
Challenges for Models of Auditory Grouping 302
Strategies for Auditory Scene Exploration 305
New Directions 309
References 310
15 Auditory Computations that Separate Speech from Competing Sounds: A Comparison of Monaural and Binaural Processes 313
Q Summerfield and J F Culling
Use of Differences in Fundamental Frequency to Separate the Voiced Speech of Concurrent Talkers 315
Use of Inter-aural Timing Cues to Separate Speech from Noise 326
Discussion 334
Comparison of Monaural and Binaural Segregation 335
Acknowledgements 336
References 336
16 Multimodal Human-Computer Interface 339
J Caelen
Elements of Ergonomics 341
Language-based Modes 343
Non-language-based Modes 344
Mode Adequacy 346
The Problems of Multimodal HCI 347
Mode Management 347
The Fusion and the Fission of Information 355
ICPdraw: An Example of a Multimodal HCI 363
Conclusion 370
Acknowledgements 371
References 371
Index 375
|
any_adam_object | 1 |
author2 | Keller, Eric |
author2_role | edt |
author2_variant | e k ek |
author_facet | Keller, Eric |
building | Verbundindex |
bvnumber | BV026180552 |
classification_rvk | CQ 4000 ES 945 ST 306 |
ctrlnum | (OCoLC)174338836 (DE-599)BVBBV026180552 |
dewey-full | 006.454 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 006 - Special computer methods |
dewey-raw | 006.454 |
dewey-search | 006.454 |
dewey-sort | 16.454 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik Sprachwissenschaft Psychologie Literaturwissenschaft |
edition | Repr. |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01572nam a2200409 c 4500</leader><controlfield tag="001">BV026180552</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20110228 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">110326r1995uuuu |||| 00||| und d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">0471944491</subfield><subfield code="9">0-471-94449-1</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)174338836</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV026180552</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">und</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-188</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">006.454</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">CQ 4000</subfield><subfield code="0">(DE-625)19006:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ES 945</subfield><subfield code="0">(DE-625)27935:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 306</subfield><subfield code="0">(DE-625)143654:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Fundamentals of speech synthesis and speech recognition</subfield><subfield code="b">basic concepts, state of the art and future challenges</subfield><subfield code="c">ed. 
by Eric Keller</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">Repr.</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Chichester <<[u.a.]>></subfield><subfield code="b">Wiley</subfield><subfield code="c">1995</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XIII, 379 S.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Sprachverarbeitung</subfield><subfield code="0">(DE-588)4116579-2</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Automatische Spracherkennung</subfield><subfield code="0">(DE-588)4003961-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="655" ind1=" " ind2="7"><subfield code="0">(DE-588)4143413-4</subfield><subfield code="a">Aufsatzsammlung</subfield><subfield code="2">gnd-content</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Automatische Spracherkennung</subfield><subfield code="0">(DE-588)4003961-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="689" ind1="1" ind2="0"><subfield code="a">Sprachverarbeitung</subfield><subfield code="0">(DE-588)4116579-2</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="1" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Keller, 
Eric</subfield><subfield code="4">edt</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">HEBIS Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=021765109&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-021765109</subfield></datafield></record></collection> |
genre | (DE-588)4143413-4 Aufsatzsammlung gnd-content |
genre_facet | Aufsatzsammlung |
id | DE-604.BV026180552 |
illustrated | Not Illustrated |
indexdate | 2024-07-09T23:06:42Z |
institution | BVB |
isbn | 0471944491 |
language | Undetermined |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-021765109 |
oclc_num | 174338836 |
open_access_boolean | |
owner | DE-188 |
owner_facet | DE-188 |
physical | XIII, 379 S. |
publishDate | 1995 |
publishDateSearch | 1995 |
publishDateSort | 1995 |
publisher | Wiley |
record_format | marc |
spelling | Fundamentals of speech synthesis and speech recognition basic concepts, state of the art and future challenges ed. by Eric Keller Repr. Chichester <<[u.a.]>> Wiley 1995 XIII, 379 S. txt rdacontent n rdamedia nc rdacarrier Sprachverarbeitung (DE-588)4116579-2 gnd rswk-swf Automatische Spracherkennung (DE-588)4003961-4 gnd rswk-swf (DE-588)4143413-4 Aufsatzsammlung gnd-content Automatische Spracherkennung (DE-588)4003961-4 s DE-604 Sprachverarbeitung (DE-588)4116579-2 s Keller, Eric edt HEBIS Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=021765109&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Fundamentals of speech synthesis and speech recognition basic concepts, state of the art and future challenges Sprachverarbeitung (DE-588)4116579-2 gnd Automatische Spracherkennung (DE-588)4003961-4 gnd |
subject_GND | (DE-588)4116579-2 (DE-588)4003961-4 (DE-588)4143413-4 |
title | Fundamentals of speech synthesis and speech recognition basic concepts, state of the art and future challenges |
title_auth | Fundamentals of speech synthesis and speech recognition basic concepts, state of the art and future challenges |
title_exact_search | Fundamentals of speech synthesis and speech recognition basic concepts, state of the art and future challenges |
title_full | Fundamentals of speech synthesis and speech recognition basic concepts, state of the art and future challenges ed. by Eric Keller |
title_fullStr | Fundamentals of speech synthesis and speech recognition basic concepts, state of the art and future challenges ed. by Eric Keller |
title_full_unstemmed | Fundamentals of speech synthesis and speech recognition basic concepts, state of the art and future challenges ed. by Eric Keller |
title_short | Fundamentals of speech synthesis and speech recognition |
title_sort | fundamentals of speech synthesis and speech recognition basic concepts state of the art and future challenges |
title_sub | basic concepts, state of the art and future challenges |
topic | Sprachverarbeitung (DE-588)4116579-2 gnd Automatische Spracherkennung (DE-588)4003961-4 gnd |
topic_facet | Sprachverarbeitung Automatische Spracherkennung Aufsatzsammlung |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=021765109&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT kellereric fundamentalsofspeechsynthesisandspeechrecognitionbasicconceptsstateoftheartandfuturechallenges |