Availability: Automatic speech recognition

Automatic speech recognition: the development of the SPHINX system

Bibliographic Details
Main Author:	Li, Kaifu 1961- (Author)
Format:	Book
Language:	English
Published:	Boston [etc.] Kluwer Acad. Publ. 1989
Series:	The Kluwer international series in engineering and computer science 62
Subjects:	Automatic speech recognition Automatische Spracherkennung
Online Access:	Table of contents
Notes:	Bibliography: pp. [187]-203
Physical Description:	XIV, 207 pp., graphs
ISBN:	0898382963

Internal format

MARC


LEADER	00000nam a2200000 cb4500
001	BV004160172
003	DE-604
005	20190228
007	t\|
008	901119s1989 xx d\|\|\| \|\|\|\| 00\|\|\| eng d
020			\|a 0898382963 \|9 0-89838-296-3
035			\|a (OCoLC)632713873
035			\|a (DE-599)BVBBV004160172
040			\|a DE-604 \|b ger \|e rakddb
041	0		\|a eng
049			\|a DE-91 \|a DE-739 \|a DE-19 \|a DE-355 \|a DE-83
050		0	\|a TK7882.S65
082	0		\|a 006.4/54 \|2 19
084			\|a ES 945 \|0 (DE-625)27935: \|2 rvk
084			\|a ST 306 \|0 (DE-625)143654: \|2 rvk
084			\|a ELT 532f \|2 stub
100	1		\|a Li, Kaifu \|d 1961- \|e Verfasser \|0 (DE-588)1176434330 \|4 aut
245	1	0	\|a Automatic speech recognition \|b the development of the SPHINX system \|c by Kai-Fu Lee
264		1	\|a Boston [u.a.] \|b Kluwer Acad. Publ. \|c 1989
300			\|a XIV, 207 S. \|b graph. Darst.
336			\|b txt \|2 rdacontent
337			\|b n \|2 rdamedia
338			\|b nc \|2 rdacarrier
490	1		\|a The Kluwer international series in engineering and computer science \|v 62
500			\|a Literaturverz. S. [187] - 203
650		4	\|a Automatic speech recognition
650	0	7	\|a Automatische Spracherkennung \|0 (DE-588)4003961-4 \|2 gnd \|9 rswk-swf
689	0	0	\|a Automatische Spracherkennung \|0 (DE-588)4003961-4 \|D s
689	0		\|5 DE-604
830		0	\|a The Kluwer international series in engineering and computer science \|v 62 \|w (DE-604)BV023545171 \|9 62
856	4	2	\|m Digitalisierung UB Regensburg \|q application/pdf \|u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=002594336&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA \|3 Inhaltsverzeichnis
943	1		\|a oai:aleph.bib-bvb.de:BVB01-002594336

Record in the search index

_version_	1817704477548871680
adam_text	Table of Contents Table of Contents v List of Figures ix List of Tables xi Foreword by Raj Reddy xiii Acknowledgements xv 1. Introduction 1 1.1. Constrained Speech Recognition: Achievements and Limitations 2 1.1.1. Speaker Independence 3 1.1.2. Continuous Speech 6 1.1.3. Large Vocabulary 8 1.1.4. Natural Task 8 1.2. Relaxing the Constraints: The SPHINX System 9 1.2.1. Hidden Markov Models: A Representation of Speech 10 1.2.2. Adding Human Knowledge 11 1.2.3. Finding a Good Unit of Speech 12 1.2.4. Speaker Learning and Adaptation 14 1.3. Summary and Monograph Outline 15 2. Hidden Markov Modeling of Speech 17 2.1. Definition of a Hidden Markov Model 17 2.2. Three HMM Problems 19 2.2.1. The Evaluation Problem: The Forward Algorithm 20 2.2.2. The Decoding Problem: The Viterbi Algorithm 22 2.2.3. The Learning Problem: The Forward-Backward Algorithm 23 2.3. Implementational Issues 26 2.3.1. Tied Transition 26 2.3.2. Null Transitions 27 2.3.3. Initialization 27 2.3.4. Scaling or Log Compression 28 2.3.5. Multiple Independent Observations 30 2.3.6. Smoothing 30 2.4. Using HMMs for Speech Recognition 32 2.4.1. Representation 32 2.4.1.1. Continuous vs. Discrete Model 32 2.4.1.2. HMM Representation of Speech Units 34 2.4.1.3. HMM Representation of Other Knowledge Sources 36 2.4.2. Using HMM for Isolated Word Tasks 36 2.4.2.1. Training 36 2.4.2.2. Recognition 37 2.4.3. Using HMM for Continuous Speech Tasks 38 2.4.3.1. Training 38 2.4.3.2. Recognition 39 3. Task and Databases 45 3.1. The Resource Management Task and Database 45 3.1.1. The Vocabulary 45 3.1.2. The Grammar 46 3.1.3. The TIRM Database 47 3.2. The TIMIT Database 48 4. The Baseline SPHINX System 51 4.1. Signal Processing 51 4.2. Vector Quantization 52 4.2.1. The Distortion Measure 52 4.2.2. A Hierarchical VQ Algorithm 53 4.3. The Phone Model 54 4.4. The Pronunciation Dictionary 55 4.5. HMM Training 56 4.6. HMM Recognition 59 4.7. Results and Discussion 60 4.8. Summary 62 5. Adding Knowledge 63 5.1. Fixed-Width Speech Parameters 64 5.1.1. Bilinear Transform on the Cepstrum Coefficients 64 5.1.2. Differenced Cepstrum Coefficients 65 5.1.3. Power and Differenced Power 66 5.1.4. Integrating Frame-Based Parameters 67 5.1.4.1. Stack and Reduce 67 5.1.4.2. Composite Distance Metric 68 5.1.4.3. Multiple Codebooks 69 5.2. Variable-Width Speech Parameters 72 5.2.1. Duration 72 5.2.2. Knowledge-based Parameters 75 5.3. Lexical/Phonological Improvements 75 5.3.1. Insertion/Deletion Modeling 77 5.3.2. Multiple Pronunciations 79 5.3.3. Other Dictionary/Phone-Set Improvements 81 5.3.3.1. Phonological Rules 81 5.3.3.2. Non-Phonemic Affricates 81 5.3.3.3. Tailoring HMM Topology 82 5.3.3.4. Final Phone Set and Dictionary 83 5.4. Results and Discussion 84 5.5. Summary 88 6. Finding a Good Unit of Speech 91 6.1. Previously Proposed Units of Speech 91 6.1.1. Words 91 6.1.2. Phones 92 6.1.3. Multi-Phone Units 93 6.1.4. Explicit Transition Modeling 94 6.1.5. Word-Dependent Phones 95 6.1.6. Triphones (Context-Dependent Phones) 95 6.1.7. Summary of Previous Units 97 6.2. Deleted Interpolation of Contextual Models 97 6.3. Function-Word-Dependent Phones 100 6.4. Generalized Triphones 103 6.5. Summary of SPHINX Training Procedure 106 6.6. Results and Discussion 107 6.7. Summary 111 7. Learning and Adaptation 115 7.1. Speaker Adaptation through Speaker Cluster Selection 116 7.1.1. Speaker Clustering 117 7.1.2. Speaker Cluster Identification 118 7.2. Interpolated Re-estimation of HMM Parameters 118 7.2.1. Different Speaker-Adaptive Estimates 119 7.2.2. Interpolated Re-estimation 122 7.3. Results and Discussion 124 7.4. Summary 126 8. Summary of Results 129 8.1. SPHINX Results 129 8.2. Comparison with Other Systems 131 8.3. Error Analysis 133 9. Conclusion 137 9.1. Trainability vs. Specificity: A Unified View 137 9.2. Contributions 138 9.3. Future Work 141 9.4. Final Remarks 143 Appendix I. Evaluating Speech Recognizers 145 I.1. Perplexity 145 I.2. Computing Error Rate 146 Appendix II. The Resource Management Task 149 II.1. The Vocabulary and the SPHINX Pronunciation Dictionary 149 II.2. The Grammar 170 II.3. Training and Test Speakers 170 Appendix III. Examples of SPHINX Recognition 173 References 187 Index 205
any_adam_object	1
author	Li, Kaifu 1961-
author_GND	(DE-588)1176434330
author_facet	Li, Kaifu 1961-
author_role	aut
author_sort	Li, Kaifu 1961-
author_variant	k l kl
building	Verbundindex
bvnumber	BV004160172
callnumber-first	T - Technology
callnumber-label	TK7882
callnumber-raw	TK7882.S65
callnumber-search	TK7882.S65
callnumber-sort	TK 47882 S65
callnumber-subject	TK - Electrical and Nuclear Engineering
classification_rvk	ES 945 ST 306
classification_tum	ELT 532f
ctrlnum	(OCoLC)632713873 (DE-599)BVBBV004160172
dewey-full	006.4/54
dewey-hundreds	000 - Computer science, information, general works
dewey-ones	006 - Special computer methods
dewey-raw	006.4/54
dewey-search	006.4/54
dewey-sort	16.4 254
dewey-tens	000 - Computer science, information, general works
discipline	Computer science, Linguistics, Electrical engineering, Literary studies
format	Book
fullrecord	<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>00000nam a2200000 cb4500</leader><controlfield tag="001">BV004160172</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20190228</controlfield><controlfield tag="007">t\|</controlfield><controlfield tag="008">901119s1989 xx d\|\|\| \|\|\|\| 00\|\|\| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">0898382963</subfield><subfield code="9">0-89838-296-3</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)632713873</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV004160172</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakddb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-91</subfield><subfield code="a">DE-739</subfield><subfield code="a">DE-19</subfield><subfield code="a">DE-355</subfield><subfield code="a">DE-83</subfield></datafield><datafield tag="050" ind1=" " ind2="0"><subfield code="a">TK7882.S65</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">006.4/54</subfield><subfield code="2">19</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ES 945</subfield><subfield code="0">(DE-625)27935:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 306</subfield><subfield code="0">(DE-625)143654:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ELT 532f</subfield><subfield code="2">stub</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Li, Kaifu</subfield><subfield 
code="d">1961-</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1176434330</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Automatic speech recognition</subfield><subfield code="b">the development of the SPHINX system</subfield><subfield code="c">by Kai-Fu Lee</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Boston [u.a.]</subfield><subfield code="b">Kluwer Acad. Publ.</subfield><subfield code="c">1989</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XIV, 207 S.</subfield><subfield code="b">graph. Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="1" ind2=" "><subfield code="a">The Kluwer international series in engineering and computer science</subfield><subfield code="v">62</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Literaturverz. S. 
[187] - 203</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Automatic speech recognition</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Automatische Spracherkennung</subfield><subfield code="0">(DE-588)4003961-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Automatische Spracherkennung</subfield><subfield code="0">(DE-588)4003961-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="830" ind1=" " ind2="0"><subfield code="a">The Kluwer international series in engineering and computer science</subfield><subfield code="v">62</subfield><subfield code="w">(DE-604)BV023545171</subfield><subfield code="9">62</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Regensburg</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=002594336&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="943" ind1="1" ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-002594336</subfield></datafield></record></collection>
id	DE-604.BV004160172
illustrated	Illustrated
indexdate	2024-12-06T15:14:35Z
institution	BVB
isbn	0898382963
language	English
oai_aleph_id	oai:aleph.bib-bvb.de:BVB01-002594336
oclc_num	632713873
open_access_boolean
owner	DE-91 DE-BY-TUM DE-739 DE-19 DE-BY-UBM DE-355 DE-BY-UBR DE-83
owner_facet	DE-91 DE-BY-TUM DE-739 DE-19 DE-BY-UBM DE-355 DE-BY-UBR DE-83
physical	XIV, 207 S. graph. Darst.
publishDate	1989
publishDateSearch	1989
publishDateSort	1989
publisher	Kluwer Acad. Publ.
record_format	marc
series	The Kluwer international series in engineering and computer science
series2	The Kluwer international series in engineering and computer science
spelling	Li, Kaifu 1961- Verfasser (DE-588)1176434330 aut Automatic speech recognition the development of the SPHINX system by Kai-Fu Lee Boston [u.a.] Kluwer Acad. Publ. 1989 XIV, 207 S. graph. Darst. txt rdacontent n rdamedia nc rdacarrier The Kluwer international series in engineering and computer science 62 Literaturverz. S. [187] - 203 Automatic speech recognition Automatische Spracherkennung (DE-588)4003961-4 gnd rswk-swf Automatische Spracherkennung (DE-588)4003961-4 s DE-604 The Kluwer international series in engineering and computer science 62 (DE-604)BV023545171 62 Digitalisierung UB Regensburg application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=002594336&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis
spellingShingle	Li, Kaifu 1961- Automatic speech recognition the development of the SPHINX system The Kluwer international series in engineering and computer science Automatic speech recognition Automatische Spracherkennung (DE-588)4003961-4 gnd
subject_GND	(DE-588)4003961-4
title	Automatic speech recognition the development of the SPHINX system
title_auth	Automatic speech recognition the development of the SPHINX system
title_exact_search	Automatic speech recognition the development of the SPHINX system
title_full	Automatic speech recognition the development of the SPHINX system by Kai-Fu Lee
title_fullStr	Automatic speech recognition the development of the SPHINX system by Kai-Fu Lee
title_full_unstemmed	Automatic speech recognition the development of the SPHINX system by Kai-Fu Lee
title_short	Automatic speech recognition
title_sort	automatic speech recognition the development of the sphinx system
title_sub	the development of the SPHINX system
topic	Automatic speech recognition Automatische Spracherkennung (DE-588)4003961-4 gnd
topic_facet	Automatic speech recognition Automatische Spracherkennung
url	http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=002594336&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA
volume_link	(DE-604)BV023545171
work_keys_str_mv	AT likaifu automaticspeechrecognitionthedevelopmentofthesphinxsystem

Availability

No print copy is available.

Order via interlibrary loan. Note: not in the THWS collection! Table of contents
