Availability: Text to speech synthesis

Text to speech synthesis: new paradigms and advances


Bibliographic Details
Format:	Book
Language:	English
Published:	Upper Saddle River, N.J.: Prentice Hall Professional Technical Reference, 2004
Edition:	1st printing
Subjects:	Speech synthesis; Sprachsynthese (speech synthesis); Aufsatzsammlung (collection of essays)
Online access:	Table of contents
Description:	Includes bibliographical references and index
Physical description:	XXIII, 257 pp., diagrams
ISBN:	013145661X

Internal format

MARC


LEADER	00000nam a2200000zc 4500
001	BV019539601
003	DE-604
005	20041103
007	t
008	041102s2004 xxud\|\|\| \|\|\|\| 00\|\|\| eng d
010			\|a 2004010674
020			\|a 013145661X \|c hardcover : alk. paper \|9 0-13-145661-X
035			\|a (OCoLC)249193154
035			\|a (DE-599)BVBBV019539601
040			\|a DE-604 \|b ger \|e aacr
041	0		\|a eng
044			\|a xxu \|c US
049			\|a DE-29T
050		0	\|a TK7882.S65
082	0		\|a 621.399
245	1	0	\|a Text to speech synthesis \|b new paradigms and advances \|c [edited by] Shrikanth Narayanan, Abeer Alwan
250			\|a 1. print.
264		1	\|a Upper Saddle River, N.J. \|b Prentice Hall Professional Technical Reference \|c 2004
300			\|a XXIII, 257 S. \|b graph. Darst.
336			\|b txt \|2 rdacontent
337			\|b n \|2 rdamedia
338			\|b nc \|2 rdacarrier
500			\|a Includes bibliographical references and index
650		4	\|a aSpeech synthesis
650	0	7	\|a Sprachsynthese \|0 (DE-588)4056501-4 \|2 gnd \|9 rswk-swf
655		7	\|0 (DE-588)4143413-4 \|a Aufsatzsammlung \|2 gnd-content
689	0	0	\|a Sprachsynthese \|0 (DE-588)4056501-4 \|D s
689	0		\|5 DE-604
700	1		\|a Narayanan, Shrikanth \|e Sonstige \|4 oth
700	1		\|a Alwan, Abeer \|e Sonstige \|4 oth
856	4	2	\|m HEBIS Datenaustausch \|q application/pdf \|u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=012907966&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA \|3 Inhaltsverzeichnis
999			\|a oai:aleph.bib-bvb.de:BVB01-012907966

Record in the search index

_version_	1804132923545223168
adam_text	TEXT TO SPEECH SYNTHESIS: New Paradigms and Advances. Shrikanth Narayanan, Abeer Alwan. PRENTICE HALL PTR, Prentice Hall Professional Technical Reference, Upper Saddle River, New Jersey 07458, www.phptr.com
Contents
PREFACE
FOREWORD
1 REDUCING DISCONTINUITIES AT SYNTHESIS TIME FOR CORPUS-BASED SPEECH SYNTHESIS
Boris Bozkurt, Thierry Dutoit, Romain Prudon, Christophe D'Alessandro, and Vincent Pagel
1.1 Introduction
1.2 Shift-Only F0 Smoothing
  1.2.1 Where to Apply Pitch Shifts
  1.2.2 Calculating Shifts to Be Introduced
  1.2.3 Preliminary Listening Tests for Shift-Only F0 Smoothing
1.3 Improving Quality of MBROLA Synthesis
  1.3.1 Background
  1.3.2 TP-MBROLA Synthesis
1.4 Evaluation
  1.4.1 Sample Preparation
  1.4.2 Test Procedure
  1.4.3 Test Results
1.5 Discussions and Conclusion
1.6 Bibliography
2 VOICE QUALITY VARIATION IN A LONG-TERM RECORDING OF A SINGLE SPEAKER SPEECH CORPUS
Hisashi Kawai and Minoru Tsuzaki
2.1 Introduction
2.2 Perceptual Experiment
  2.2.1 Speech material
  2.2.2 Stimuli
  2.2.3 Procedure
  2.2.4 Results
2.3 Factors of Voice Quality Variation
2.4 Candidates of Acoustic Correlates
  2.4.1 Spectral Tilt in 0-4 kHz Band
  2.4.2 MFCC Distance in 0-4 kHz Band
  2.4.3 Higher Frequency Power
  2.4.4 Peak Amplitude of Autocorrelation Coefficients
  2.4.5 Fundamental Frequency and Speech Rate
  2.4.6 Time Intervals
2.5 Prediction of Voice Quality Difference Scores
2.6 Summary
2.7 Bibliography
3 JOIN COST FOR UNIT SELECTION SPEECH SYNTHESIS
Jithendra Vepa and Simon King
3.1 Introduction
3.2 Previous Work
  3.2.1 Join Cost Functions Based on Spectral Measures
  3.2.2 Combined Join Cost and Target Cost Functions
3.3 Spectral Distances
  3.3.1 Parameterizations
  3.3.2 Simple Distance Measures
  3.3.3 Statistically Motivated Distance Measures
  3.3.4 Weighted Distances
3.4 Perceptual Listening Tests
  3.4.1 Test Stimuli
  3.4.2 Test Design
  3.4.3 Test Procedure
3.5 Results and Discussion
  3.5.1 Listener Ratings
  3.5.2 Correlations with Statistical Distances
  3.5.3 Correlations with Weighted Distances
3.6 Conclusions
  3.6.1 Weighted Sums of Join Costs
  3.6.2 The Listening Test
  3.6.3 Correlation as an Evaluation Tool
  3.6.4 Future Work
3.7 Bibliography
4 ARTICULATORY MODELING: A ROLE IN CONCATENATIVE TEXT TO SPEECH SYNTHESIS
M. Mohan Sondhi and Daniel J. Sinder
4.1 Introduction
4.2 Articulatory Modeling
  4.2.1 Vocal Tract Acoustics
  4.2.2 Articulatory Parameters
  4.2.3 Acoustic Source Models
  4.2.4 Synthesis from the Parameters
4.3 Rule-Based Control of the Parameters
4.4 Concatenative Articulatory Synthesis
  4.4.1 Motivation
  4.4.2 Terminology
  4.4.3 Articulatory Units from Natural Speech
  4.4.4 The Speech Mimic
  4.4.5 A Prototype TTS System
4.5 Concluding Remarks
4.6 Bibliography
5 MINIMIZING THE AMOUNT OF PITCH MODIFICATION IN SPEECH SYNTHESIS
Esther Klabbers, Jan van Santen and Johan Wouters
5.1 Introduction
5.2 Speech Corpus Analysis
  5.2.1 Prosodic Factors
  5.2.2 Material
  5.2.3 Distance Measures
  5.2.4 Results
5.3 Text Corpus Analysis
  5.3.1 Material
  5.3.2 Results
5.4 Perceptual Experiment
  5.4.1 Material
  5.4.2 Method
  5.4.3 Results
5.5 Conclusion
5.6 Bibliography
6 THE USE OF SPEECH RECOGNITION TECHNOLOGY IN SPEECH SYNTHESIS
Mari Ostendorf and Ivan Bulyko
6.1 Introduction
6.2 Speech Recognition
6.3 ASR in Synthesis
  6.3.1 Speech Synthesis as a Search Problem
  6.3.2 ASR Tools for Annotation
  6.3.3 Speech Models
  6.3.4 Adaptation for Voice Transformation
  6.3.5 N-grams for Text Processing and Language Generation
  6.3.6 Statistical Models for Prosody Prediction
6.4 Limitations
6.5 Speculations
  6.5.1 ASR and Parametric Synthesis
  6.5.2 Can Synthesis Impact Recognition?
6.6 Bibliography
7 AN HMM-BASED APPROACH TO MULTILINGUAL SPEECH SYNTHESIS
Keiichi Tokuda, Heiga Zen and Alan W. Black
7.1 Introduction
7.2 HMM-Based Speech Synthesis System
  7.2.1 Training
  7.2.2 Synthesis
7.3 F0 Pattern Modeling by HMM
  7.3.1 HMM Based on Multispace Probability Distribution
  7.3.2 Application to F0 Pattern Modeling
7.4 Speech-Parameter Generation from an HMM
  7.4.1 Speech-Parameter-Generation Algorithm
  7.4.2 Determination of State Durations
  7.4.3 Example of Parameter Generation
7.5 Implementation on Festival Architecture
7.6 Discussion
7.7 Conclusion
7.8 Bibliography
8 PROSODY CONTROL FOR HMM-BASED JAPANESE TTS
Koji Iwano, Masahiro Yamada, Taro Togawa and Sadaoki Furui
8.1 Introduction
8.2 Outline of HMM-Based TTS System
8.3 Prosody Generation Using the Quantification Theory (Type 1)
  8.3.1 Quantification Theory (Type 1)
  8.3.2 F0 Contour Control Model
  8.3.3 Phoneme-Duration-Control Model
8.4 Speech-Rate-Variable Synthesis Method
  8.4.1 Database
  8.4.2 Phoneme-Duration Model Generated by Interpolation
  8.4.3 Experiments
  8.4.4 Experimental Results
8.5 Conclusions
8.6 Bibliography
9 SYNTHESIZING EXPRESSIVE SPEECH: OVERVIEW, CHALLENGES, AND OPEN QUESTIONS
Murtaza Bulut, Shrikanth Narayanan and Lewis Johnson
9.1 Introduction
9.2 Theories of Emotion
9.3 Dimensions of Emotional Space
9.4 Speech Synthesis Methods
  9.4.1 Formant Synthesis (Rule-Driven Synthesis)
  9.4.2 Concatenative Synthesis (Data-Driven Synthesis)
  9.4.3 Articulatory Synthesis (Model-Driven Synthesis)
9.5 Emotional Speech Data Collection
  9.5.1 Data Collection for Concatenative Speech Synthesis
9.6 Experimental Evaluation of Expressive Speech
9.7 Presentation of Results From Case Studies
9.8 Conclusion
9.9 Open Questions and Future Directions
9.10 Bibliography
10 UNIT SELECTION SYNTHESIS OF PROSODY: EVALUATION USING DIPHONE TRANSPLANTATION
Romain Prudon, Christophe D'Alessandro and Philippe Boula de Mareuil
10.1 Introduction
10.2 Computing Prosody by Selection
  10.2.1 Databases
  10.2.2 Selection System Architecture
  10.2.3 Tuning Selection for Prosody Synthesis
10.3 Comparative Evaluation
  10.3.1 Prosody Generation Modules
  10.3.2 Test Methodology
10.4 Results
  10.4.1 Presentation of the Results
  10.4.2 Overall Analysis of the Results
  10.4.3 Analysis of the Results by Sentence Length
10.5 Conclusion
10.6 Bibliography
11 TOWARD EXPRESSIVE SYNTHETIC SPEECH
Ellen Eide, Raimo Bakis, Wael Hamza and John F. Pitrelli
11.1 Introduction
11.2 A Pilot Study For Generating Expressive Speech
  11.2.1 Baseline System
  11.2.2 Data
  11.2.3 Experiment 1: Liveliness
  11.2.4 Experiment 2: Sadness
  11.2.5 Experiment 3: Anger
  11.2.6 Experiment 4: Expression Detection
11.3 Generating Expressive Speech with Limited Resources
  11.3.1 Expressive Data Collection
  11.3.2 F0 Target Estimation
  11.3.3 Target-Duration Estimation
  11.3.4 Results
  11.3.5 Adaptation via Sinusoidal Modeling
11.4 Rule-Based Methods for Generating Expressive Speech
11.5 Use of an Expressive TTS System
  11.5.1 Proposed Extensions to SSML
11.6 Assessing Performance
11.7 Conclusions
11.8 Bibliography
11.9 Footnotes
11.10 Copyright Forms
11.11 References
INDEX
any_adam_object	1
building	Verbundindex
bvnumber	BV019539601
callnumber-first	T - Technology
callnumber-label	TK7882
callnumber-raw	TK7882.S65
callnumber-search	TK7882.S65
callnumber-sort	TK 47882 S65
callnumber-subject	TK - Electrical and Nuclear Engineering
ctrlnum	(OCoLC)249193154 (DE-599)BVBBV019539601
dewey-full	621.399
dewey-hundreds	600 - Technology (Applied sciences)
dewey-ones	621 - Applied physics
dewey-raw	621.399
dewey-search	621.399
dewey-sort	3621.399
dewey-tens	620 - Engineering and allied operations
discipline	Elektrotechnik / Elektronik / Nachrichtentechnik
edition	1. print.
format	Book
fullrecord	<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01551nam a2200409zc 4500</leader><controlfield tag="001">BV019539601</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20041103 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">041102s2004 xxud\|\|\| \|\|\|\| 00\|\|\| eng d</controlfield><datafield tag="010" ind1=" " ind2=" "><subfield code="a">2004010674</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">013145661X</subfield><subfield code="c">hardcover : alk. paper</subfield><subfield code="9">0-13-145661-X</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)249193154</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV019539601</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">aacr</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="044" ind1=" " ind2=" "><subfield code="a">xxu</subfield><subfield code="c">US</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-29T</subfield></datafield><datafield tag="050" ind1=" " ind2="0"><subfield code="a">TK7882.S65</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">621.399</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Text to speech synthesis</subfield><subfield code="b">new paradigms and advances</subfield><subfield code="c">[edited by] Shrikanth Narayanan, Abeer Alwan</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">1. 
print.</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Upper Saddle River, N.J.</subfield><subfield code="b">Prentice Hall Professional Technical Reference</subfield><subfield code="c">2004</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XXIII, 257 S.</subfield><subfield code="b">graph. Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Includes bibliographical references and index</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">aSpeech synthesis</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Sprachsynthese</subfield><subfield code="0">(DE-588)4056501-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="655" ind1=" " ind2="7"><subfield code="0">(DE-588)4143413-4</subfield><subfield code="a">Aufsatzsammlung</subfield><subfield code="2">gnd-content</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Sprachsynthese</subfield><subfield code="0">(DE-588)4056501-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Narayanan, Shrikanth</subfield><subfield code="e">Sonstige</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Alwan, Abeer</subfield><subfield code="e">Sonstige</subfield><subfield code="4">oth</subfield></datafield><datafield tag="856" ind1="4" 
ind2="2"><subfield code="m">HEBIS Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=012907966&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-012907966</subfield></datafield></record></collection>
genre	(DE-588)4143413-4 Aufsatzsammlung gnd-content
genre_facet	Aufsatzsammlung
id	DE-604.BV019539601
illustrated	Illustrated
indexdate	2024-07-09T20:00:32Z
institution	BVB
isbn	013145661X
language	English
lccn	2004010674
oai_aleph_id	oai:aleph.bib-bvb.de:BVB01-012907966
oclc_num	249193154
open_access_boolean
owner	DE-29T
owner_facet	DE-29T
physical	XXIII, 257 S. graph. Darst.
publishDate	2004
publishDateSearch	2004
publishDateSort	2004
publisher	Prentice Hall Professional Technical Reference
record_format	marc
spelling	Text to speech synthesis new paradigms and advances [edited by] Shrikanth Narayanan, Abeer Alwan 1. print. Upper Saddle River, N.J. Prentice Hall Professional Technical Reference 2004 XXIII, 257 S. graph. Darst. txt rdacontent n rdamedia nc rdacarrier Includes bibliographical references and index aSpeech synthesis Sprachsynthese (DE-588)4056501-4 gnd rswk-swf (DE-588)4143413-4 Aufsatzsammlung gnd-content Sprachsynthese (DE-588)4056501-4 s DE-604 Narayanan, Shrikanth Sonstige oth Alwan, Abeer Sonstige oth HEBIS Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=012907966&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis
spellingShingle	Text to speech synthesis new paradigms and advances aSpeech synthesis Sprachsynthese (DE-588)4056501-4 gnd
subject_GND	(DE-588)4056501-4 (DE-588)4143413-4
title	Text to speech synthesis new paradigms and advances
title_auth	Text to speech synthesis new paradigms and advances
title_exact_search	Text to speech synthesis new paradigms and advances
title_full	Text to speech synthesis new paradigms and advances [edited by] Shrikanth Narayanan, Abeer Alwan
title_fullStr	Text to speech synthesis new paradigms and advances [edited by] Shrikanth Narayanan, Abeer Alwan
title_full_unstemmed	Text to speech synthesis new paradigms and advances [edited by] Shrikanth Narayanan, Abeer Alwan
title_short	Text to speech synthesis
title_sort	text to speech synthesis new paradigms and advances
title_sub	new paradigms and advances
topic	aSpeech synthesis Sprachsynthese (DE-588)4056501-4 gnd
topic_facet	aSpeech synthesis Sprachsynthese Aufsatzsammlung
url	http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=012907966&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA
work_keys_str_mv	AT narayananshrikanth texttospeechsynthesisnewparadigmsandadvances AT alwanabeer texttospeechsynthesisnewparadigmsandadvances

Availability

No print copy is available.

Note: Not in the THWS holdings; order via interlibrary loan.
