Statistical methods for speech recognition
Main Author: | Jelinek, Frederick |
---|---|
Format: | Book |
Language: | English |
Published: | Cambridge, Mass. [et al.]: MIT Press, 1997 |
Series: | Language, speech and communication |
Subjects: | Statistik; Statistische Analyse; Automatische Spracherkennung |
Online Access: | Table of contents |
Description: | XXI, 283 p., graphs |
ISBN: | 0262100665 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV011755794 | ||
003 | DE-604 | ||
005 | 19980603 | ||
007 | t | ||
008 | 980205s1997 d||| |||| 00||| eng d | ||
020 | |a 0262100665 |9 0-262-10066-5 | ||
035 | |a (OCoLC)246976657 | ||
035 | |a (DE-599)BVBBV011755794 | ||
040 | |a DE-604 |b ger |e rakddb | ||
041 | 0 | |a eng | |
049 | |a DE-29 |a DE-29T |a DE-739 |a DE-19 |a DE-188 |a DE-355 | ||
100 | 1 | |a Jelinek, Frederick |d 1932- |e Verfasser |4 aut | |
245 | 1 | 0 | |a Statistical methods for speech recognition |c Frederick Jelinek |
264 | 1 | |a Cambridge, Mass. [u.a.] |b MIT Press |c 1997 | |
300 | |a XXI, 283 S. |b graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 0 | |a Language, speech and communication | |
650 | 0 | 7 | |a Statistik |0 (DE-588)4056995-0 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Statistische Analyse |0 (DE-588)4116599-8 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Automatische Spracherkennung |0 (DE-588)4003961-4 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Automatische Spracherkennung |0 (DE-588)4003961-4 |D s |
689 | 0 | 1 | |a Statistik |0 (DE-588)4056995-0 |D s |
689 | 0 | |5 DE-604 | |
689 | 1 | 0 | |a Automatische Spracherkennung |0 (DE-588)4003961-4 |D s |
689 | 1 | 1 | |a Statistische Analyse |0 (DE-588)4116599-8 |D s |
689 | 1 | |8 1\p |5 DE-604 | |
856 | 4 | 2 | |m Digitalisierung UB Regensburg |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=007933262&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-007933262 | ||
883 | 1 | |8 1\p |a cgwrk |d 20201028 |q DE-101 |u https://d-nb.info/provenance/plan#cgwrk |
Record in the search index
_version_ | 1804126299927609344 |
---|---|
adam_text | Contents
Preface xix
Chapter 1 The Speech Recognition Problem 1
1.1 Introduction 1
1.2 A Mathematical Formulation 4
1.3 Components of a Speech Recognizer 5
1.3.1 Acoustic Processing 5
1.3.2 Acoustic Modeling 7
1.3.3 Language Modeling 8
1.3.4 Hypothesis Search 8
1.3.5 The Source-Channel Model of Speech Recognition 9
1.4 About This Book 9
1.5 Vector Quantization 10
1.6 Additional Reading 12
References 12
Chapter 2 Hidden Markov Models 15
2.1 About Markov Chains 15
2.2 The Hidden Markov Model Concept 17
2.3 The Trellis 19
2.4 Search for the Likeliest State Transition Sequence 21
2.5 Presence of Null Transitions 23
2.6 Dealing with an HMM That Has Null Transitions That Do Not Form a Loop 25
2.7 Estimation of Statistical Parameters of HMMs 27
2.8 Practical Need for Normalization 33
2.9 Alternative Definitions of HMMs 35
2.9.1 HMMs Outputting Real Numbers 35
2.9.2 HMM Outputs Attached to States 35
2.10 Additional Reading 36
References 37
Chapter 3 The Acoustic Model 39
3.1 Introduction 39
3.2 Phonetic Acoustic Models 40
3.3 More on Acoustic Model Training 43
3.4 The Effect of Context 44
3.5 Viterbi Alignment 45
3.6 Singleton Fenonic Base Forms 45
3.7 A Needed Generalization 47
3.8 Generation of Synthetic Base Forms 48
3.9 A Further Refinement 51
3.10 Singleton Base Forms for Words Outside the Vocabulary 52
3.11 Additional Reading 52
References 54
Chapter 4 Basic Language Modeling 57
4.1 Introduction 57
4.2 Equivalence Classification of History 59
4.3 The Trigram Language Model 60
4.4 Optimal Linear Smoothing 62
4.5 An Example of a Trigram Language Model 66
4.6 Practical Aspects of Deleted Interpolation 66
4.7 Backing-Off 69
4.8 HMM Tagging 70
4.9 Use of Tag Equivalence Classification in a Language Model 72
4.10 Vocabulary Selection and Personalization from Text Databases 73
4.11 Additional Reading 75
References 76
Chapter 5 The Viterbi Search 79
5.1 Introduction 79
5.2 Finding the Most Likely Word Sequence 79
5.3 The Beam Search 81
5.4 Successive Language Model Refinement Search 84
5.5 Search versus Language Model State Spaces 86
5.6 N-Best Search 86
5.7 A Maximum Probability Lattice 89
5.8 Additional Reading 90
References 90
Chapter 6 Hypothesis Search on a Tree and the Fast Match 93
6.1 Introduction 93
6.2 Tree Search versus Trellis (Viterbi) Search 95
6.3 A* Search 95
6.4 Stack Algorithm for Speech Recognition 97
6.5 Modifications of the Tree Search 99
6.6 Multiple-Stack Search 99
6.6.1 First Algorithm 100
6.6.2 A Multistack Algorithm 101
6.6.3 Actual Multistack Algorithm 102
6.7 Fast Match 103
6.8 The Cost of Search Shortcuts 109
6.9 Additional Reading 110
References 110
Chapter 7 Elements of Information Theory 113
7.1 Introduction 113
7.2 Functional Form of the Basic Information Measure 114
7.3 Some Mathematical Properties of Entropy 119
7.4 An Alternative Point of View and Notation 123
7.5 A Source-Coding Theorem 126
7.6 A Brief Digression 132
7.7 Mutual Information 132
7.8 Additional Reading 135
References 135
Chapter 8 The Complexity of Tasks—The Quality of Language Models 137
8.1 The Problem with Estimation of Recognition Task Complexity 137
8.2 The Shannon Game 139
8.3 Perplexity 141
8.4 The Conditional Entropy of the System 142
8.5 Additional Reading 144
References 145
Chapter 9 The Expectation-Maximization Algorithm and Its Consequences 147
9.1 Introduction 147
9.2 The EM Theorem 147
9.3 The Baum-Welch Algorithm 149
9.4 Real Vector Outputs of the Acoustic Processor 152
9.4.1 Development for Two Dimensions 152
9.4.2 The Generalization to k Dimensions 158
9.5 Constant and Tied Parameters 158
9.5.1 Keeping Some Parameters Constant 158
9.5.2 Tying of Parameter Sets 159
9.6 Tied Mixtures 161
9.7 Additional Reading 163
References 163
Chapter 10 Decision Trees and Tree Language Models 165
10.1 Introduction 165
10.2 Application of Decision Trees to Language Modeling 166
10.3 Decision Tree Example 166
10.4 What Questions? 168
10.5 The Entropy Goodness Criterion for the Selection of Questions, and a Stopping Rule 170
10.6 A Restricted Set of Questions 172
10.7 Selection of Questions by Chou's Method 173
10.8 Selection of the Initial Split of a Set Y into Complementary Subsets 176
10.9 The Two-ing Theorem 177
10.10 Practical Considerations of Chou's Method 179
10.10.1 Problem of 0s in the q-Distribution: The Gini Index 179
10.10.2 Equivalence Classification Induced by Decision Trees 182
10.10.3 Computationally Feasible Specification of Decision Tree Equivalence Classes 183
10.11 Construction of Decision Trees Based on Word Encoding 184
10.12 A Hierarchical Classification of Vocabulary Words 186
10.13 More on Decision Trees Based on Word Encoding 188
10.13.1 Implementing Hierarchical Word Classification 188
10.13.2 Predicting Encoded Words One Bit at a Time 189
10.13.3 Treatment of Unseen Training Data 190
10.13.4 Problems and Advantages of Word Encoding 190
10.14 Final Remarks on the Decision Tree Method 191
10.14.1 Smoothing 191
10.14.2 Fragmentation of Data 192
10.15 Additional Reading 193
References 194
Chapter 11 Phonetics from Orthography: Spelling-to-Base Form Mappings 197
11.1 Overview of Base Form Generation from Spelling 197
11.2 Generating Alignment Data 199
11.3 Decision Tree Classification of Phonetic Environments 201
11.4 Finding the Base Forms 204
11.5 Additional Reading 204
References 205
Chapter 12 Triphones and Allophones 207
12.1 Introduction 207
12.2 Triphones 208
12.3 The General Method 210
12.4 Collecting Realizations of Particular Phones 210
12.5 A Direct Method 211
12.6 The Consequences 214
12.7 Back to Triphones 215
12.8 Additional Reading 217
References 218
Chapter 13 Maximum Entropy Probability Estimation and Language Models 219
13.1 Outline of the Maximum Entropy Approach 219
13.2 The Main Idea 220
13.3 The General Solution 221
13.4 The Practical Problem [page number illegible]
13.5 An Example 224
13.6 A Trigram Language Model 227
13.7 Limiting Computation 228
13.8 Iterative Scaling 231
13.9 The Problem of Finding Appropriate Constraints 233
13.10 Weighting of Diverse Evidence: Voting 234
13.11 Limiting Data Fragmentation: Multiple Decision Trees 236
13.11.1 Combining Different Knowledge Sources 236
13.11.2 Spontaneous Multiple Tree Development 238
13.12 Remaining Unsolved Problems 240
13.13 Additional Reading 241
References 242
Chapter 14 Three Applications of Maximum Entropy Estimation to Language Modeling 245
14.1 About the Applications 245
14.2 Simple Language Model Adaptation to a New Domain 246
14.3 A More Complex Adaptation 248
14.4 A Dynamic Language Model: Triggers 251
14.5 The Cache Language Model 253
14.6 Additional Reading 255
References 255
Chapter 15 Estimation of Probabilities from Counts and the Back-Off Method 257
15.1 Inadequacy of Relative Frequency Estimates 257
15.2 Estimation of Probabilities from Counts Using Held-Out Data 258
15.2.1 The Basic Idea 258
15.2.2 The Estimation 259
15.2.3 Deciding the Value of M 261
15.3 Universality of the Held-Out Estimate 262
15.4 The Good-Turing Estimate 263
15.5 Applicability of the Held-Out and Good-Turing Estimates 265
15.6 Enhancing Estimation Methods 268
15.6.1 Frequency Enhancement of Held-Out Estimation of Bigrams 268
15.6.2 Frequency Enhancement of Good-Turing Estimation of Bigrams 269
15.6.3 Other Enhancements 270
15.7 The Back-Off Language Model 271
15.8 Additional Reading 273
References 274
Name Index 275
Subject Index 279
|
any_adam_object | 1 |
author | Jelinek, Frederick 1932- |
author_facet | Jelinek, Frederick 1932- |
author_role | aut |
author_sort | Jelinek, Frederick 1932- |
author_variant | f j fj |
building | Verbundindex |
bvnumber | BV011755794 |
ctrlnum | (OCoLC)246976657 (DE-599)BVBBV011755794 |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01671nam a2200397 c 4500</leader><controlfield tag="001">BV011755794</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">19980603 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">980205s1997 d||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">0262100665</subfield><subfield code="9">0-262-10066-5</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)246976657</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV011755794</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakddb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-29</subfield><subfield code="a">DE-29T</subfield><subfield code="a">DE-739</subfield><subfield code="a">DE-19</subfield><subfield code="a">DE-188</subfield><subfield code="a">DE-355</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Jelinek, Frederick</subfield><subfield code="d">1932-</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Statistical methods for speech recognition</subfield><subfield code="c">Frederick Jelinek</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Cambridge, Mass. [u.a.]</subfield><subfield code="b">MIT Press</subfield><subfield code="c">1997</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XXI, 283 S.</subfield><subfield code="b">graph. 
Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">Language, speech and communication</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Statistik</subfield><subfield code="0">(DE-588)4056995-0</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Statistische Analyse</subfield><subfield code="0">(DE-588)4116599-8</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Automatische Spracherkennung</subfield><subfield code="0">(DE-588)4003961-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Automatische Spracherkennung</subfield><subfield code="0">(DE-588)4003961-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Statistik</subfield><subfield code="0">(DE-588)4056995-0</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="689" ind1="1" ind2="0"><subfield code="a">Automatische Spracherkennung</subfield><subfield code="0">(DE-588)4003961-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="1" ind2="1"><subfield code="a">Statistische Analyse</subfield><subfield code="0">(DE-588)4116599-8</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" 
ind1="1" ind2=" "><subfield code="8">1\p</subfield><subfield code="5">DE-604</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Regensburg</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=007933262&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-007933262</subfield></datafield><datafield tag="883" ind1="1" ind2=" "><subfield code="8">1\p</subfield><subfield code="a">cgwrk</subfield><subfield code="d">20201028</subfield><subfield code="q">DE-101</subfield><subfield code="u">https://d-nb.info/provenance/plan#cgwrk</subfield></datafield></record></collection> |
id | DE-604.BV011755794 |
illustrated | Illustrated |
indexdate | 2024-07-09T18:15:15Z |
institution | BVB |
isbn | 0262100665 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-007933262 |
oclc_num | 246976657 |
open_access_boolean | |
owner | DE-29 DE-29T DE-739 DE-19 DE-BY-UBM DE-188 DE-355 DE-BY-UBR |
owner_facet | DE-29 DE-29T DE-739 DE-19 DE-BY-UBM DE-188 DE-355 DE-BY-UBR |
physical | XXI, 283 S. graph. Darst. |
publishDate | 1997 |
publishDateSearch | 1997 |
publishDateSort | 1997 |
publisher | MIT Press |
record_format | marc |
series2 | Language, speech and communication |
spelling | Jelinek, Frederick 1932- Verfasser aut Statistical methods for speech recognition Frederick Jelinek Cambridge, Mass. [u.a.] MIT Press 1997 XXI, 283 S. graph. Darst. txt rdacontent n rdamedia nc rdacarrier Language, speech and communication Statistik (DE-588)4056995-0 gnd rswk-swf Statistische Analyse (DE-588)4116599-8 gnd rswk-swf Automatische Spracherkennung (DE-588)4003961-4 gnd rswk-swf Automatische Spracherkennung (DE-588)4003961-4 s Statistik (DE-588)4056995-0 s DE-604 Statistische Analyse (DE-588)4116599-8 s 1\p DE-604 Digitalisierung UB Regensburg application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=007933262&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis 1\p cgwrk 20201028 DE-101 https://d-nb.info/provenance/plan#cgwrk |
spellingShingle | Jelinek, Frederick 1932- Statistical methods for speech recognition Statistik (DE-588)4056995-0 gnd Statistische Analyse (DE-588)4116599-8 gnd Automatische Spracherkennung (DE-588)4003961-4 gnd |
subject_GND | (DE-588)4056995-0 (DE-588)4116599-8 (DE-588)4003961-4 |
title | Statistical methods for speech recognition |
title_auth | Statistical methods for speech recognition |
title_exact_search | Statistical methods for speech recognition |
title_full | Statistical methods for speech recognition Frederick Jelinek |
title_fullStr | Statistical methods for speech recognition Frederick Jelinek |
title_full_unstemmed | Statistical methods for speech recognition Frederick Jelinek |
title_short | Statistical methods for speech recognition |
title_sort | statistical methods for speech recognition |
topic | Statistik (DE-588)4056995-0 gnd Statistische Analyse (DE-588)4116599-8 gnd Automatische Spracherkennung (DE-588)4003961-4 gnd |
topic_facet | Statistik Statistische Analyse Automatische Spracherkennung |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=007933262&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT jelinekfrederick statisticalmethodsforspeechrecognition |