Speech and audio signal processing: processing and perception of speech and music
Main authors: | Gold, Ben; Morgan, Nelson |
---|---|
Format: | Book |
Language: | English |
Published: | New York [et al.]: Wiley, 2000 |
Subjects: | |
Online access: | Table of contents |
Description: | XVIII, 537 pp., ill., diagrams |
ISBN: | 0471351547 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV013148731 | ||
003 | DE-604 | ||
005 | 20000515 | ||
007 | t | ||
008 | 000511s2000 ad|| |||| 00||| eng d | ||
020 | |a 0471351547 |9 0-471-35154-7 | ||
035 | |a (OCoLC)318207673 | ||
035 | |a (DE-599)BVBBV013148731 | ||
040 | |a DE-604 |b ger |e rakwb | ||
041 | 0 | |a eng | |
049 | |a DE-29T |a DE-703 |a DE-1102 |a DE-573 |a DE-634 |a DE-83 | ||
050 | 0 | |a TK7882.S65 | |
082 | 0 | |a 621.382/2 |2 21 | |
084 | |a ZN 6060 |0 (DE-625)157500: |2 rvk | ||
100 | 1 | |a Gold, Ben |e Verfasser |4 aut | |
245 | 1 | 0 | |a Speech and audio signal processing |b processing and perception of speech and music |c Ben Gold ; Nelson Morgan ; with contributions from Hervé Bourlard ... |
264 | 1 | |a New York [u.a.] |b Wiley |c 2000 | |
300 | |a XVIII, 537 S. |b Ill., graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
650 | 4 | |a Música electrónica | |
650 | 4 | |a Procesamiento de señales - Técnicas digitales | |
650 | 4 | |a Sistemas de procesamiento de la voz | |
650 | 0 | 7 | |a Digitale Sprachverarbeitung |0 (DE-588)4233857-8 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Automatische Sprachproduktion |0 (DE-588)4143703-2 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Digitale Sprachverarbeitung |0 (DE-588)4233857-8 |D s |
689 | 0 | 1 | |a Automatische Sprachproduktion |0 (DE-588)4143703-2 |D s |
689 | 0 | |8 1\p |5 DE-604 | |
689 | 1 | 0 | |a Digitale Sprachverarbeitung |0 (DE-588)4233857-8 |D s |
689 | 1 | |5 DE-604 | |
700 | 1 | |a Morgan, Nelson |e Verfasser |4 aut | |
856 | 4 | 2 | |m GBV Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=008958038&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-008958038 | ||
883 | 1 | |8 1\p |a cgwrk |d 20201028 |q DE-101 |u https://d-nb.info/provenance/plan#cgwrk |
Record in the search index
_version_ | 1804127847705477120 |
---|---|
adam_text | BEN GOLD MASSACHUSETTS INSTITUTE OF TECHNOLOGY LINCOLN LABORATORY NELSON
MORGAN UNIVERSITY OF CALIFORNIA AT BERKELEY INTERNATIONAL COMPUTER
SCIENCE INSTITUTE WITH CONTRIBUTIONS FROM HERVE BOURLARD ERIC
FOSLER-LUSSIER JEFF GILBERT JOHN WILEY & SONS, INC. NEW YORK /
CHICHESTER / WEINHEIM / BRISBANE / SINGAPORE / TORONTO CHAPTER 1
INTRODUCTION 1 1.1 WHY WE WROTE THIS BOOK 1 1.2 HOW TO USE THIS BOOK 2
1.3 A CONFESSION 4 1.4 ACKNOWLEDGMENTS 4 CHAPTER 2 SYNTHETIC AUDIO: A
BRIEF HISTORY 9 2.1 VON KEMPELEN 9 2.2 THE VODER 9 2.3 TEACHING THE
OPERATOR TO MAKE THE VODER TALK 11 2.4 SPEECH SYNTHESIS AFTER THE VODER
14 2.5 MUSIC MACHINES 14 2.6 EXERCISES 17 CHAPTER 3 SPEECH ANALYSIS AND
SYNTHESIS OVERVIEW 20 3.1 BACKGROUND 20 3.1.1 TRANSMISSION OF ACOUSTIC
SIGNALS 20 3.1.2 ACOUSTICAL TELEGRAPHY BEFORE MORSE CODE 21 3.1.3 THE
TELEPHONE 22 3.1.4 THE CHANNEL VOCODER AND BANDWIDTH COMPRESSION 22 3.2
VOICE-CODING CONCEPTS 24 3.3 HOMER DUDLEY (1898-1981) 28 3.4 EXERCISES
35 3.5 APPENDIX: HEARING OF THE FALL OF TROY 36 CHAPTER 4 BRIEF HISTORY
OF AUTOMATIC SPEECH RECOGNITION 39 4.1 RADIO REX 39 4.2 DIGIT
RECOGNITION 40 4.3 SPEECH RECOGNITION IN THE 1950S 42 VII VIII CONTENTS
4.4 THE 1960S 42 4.4.1 SHORT-TERM SPECTRAL ANALYSIS 44 4.4.2 PATTERN
MATCHING 44 4.5 1971-1976 ARPA PROJECT 45 4.6 ACHIEVED BY 1976 45 4.7
THE 1980S IN AUTOMATIC SPEECH RECOGNITION 46 4.7.1 LARGE CORPORA
COLLECTION 46 4.7.2 FRONT ENDS 47 4.7.3 HIDDEN MARKOV MODELS 47 4.7.4
THE SECOND (D)ARPA SPEECH-RECOGNITION PROGRAM 48 4.7.5 THE RETURN OF
NEURAL NETS 49 4.7.6 KNOWLEDGE-BASED APPROACHES 50 4.8 RECENT WORK 50
4.8 SOME LESSONS 51 4.9 EXERCISES 52 CHAPTER 5 SPEECH-RECOGNITION
OVERVIEW 56 5.1 WHY STUDY AUTOMATIC SPEECH RECOGNITION? 56 5.2 WHY IS
AUTOMATIC SPEECH RECOGNITION HARD? 57 5.3 AUTOMATIC SPEECH RECOGNITION
DIMENSIONS 59 5.3.1 TASK PARAMETERS 59 5.3.2 SAMPLE DOMAIN: LETTERS OF
THE ALPHABET 61 5.4 COMPONENTS OF AUTOMATIC SPEECH RECOGNITION 61 5.5
FINAL COMMENTS 64 5.6 EXERCISES 65 PART II MATHEMATICAL BACKGROUND
CHAPTER 6 DIGITAL SIGNAL PROCESSING 69 6.1 INTRODUCTION 69 6.2 THE Z
TRANSFORM 69 6.3 INVERSE Z TRANSFORM 70 6.4 CONVOLUTION 71 6.5 SAMPLING
72 6.6 LINEAR DIFFERENCE EQUATIONS 73 6.7 FIRST-ORDER LINEAR DIFFERENCE
EQUATIONS 74 6.8 RESONANCE 75 6.9 CONCLUDING COMMENTS 79
6.10 EXERCISES 79 CHAPTER 7 DIGITAL FILTERS AND DISCRETE FOURIER
TRANSFORM 83 7.1 INTRODUCTION 83 7.2 FILTERING CONCEPTS 84 7.3 USEFUL
FILTER FUNCTIONS 88 7.4 TRANSFORMATIONS FOR DIGITAL FILTER DESIGN 90 7.5
DIGITAL FILTER DESIGN WITH BILINEAR TRANSFORMATION 91 7.6 THE DISCRETE
FOURIER TRANSFORM 92 7.7 FAST FOURIER TRANSFORM METHODS 95 7.8 RELATION
BETWEEN THE DFT AND DIGITAL FILTERS 98 7.9 EXERCISES 100 CHAPTER 8
PATTERN CLASSIFICATION 103 8.1 INTRODUCTION 103 8.2 FEATURE EXTRACTION
105 8.2.1 SOME OPINIONS 106 8.3 PATTERN-CLASSIFICATION METHODS 107
8.3.1 MINIMUM DISTANCE CLASSIFIERS 107 8.3.2 DISCRIMINANT FUNCTIONS 109
8.3.3 GENERALIZED DISCRIMINATORS 110 8.4 EXERCISES 113 8.5 APPENDIX:
MULTILAYER PERCEPTRON TRAINING 114 8.5.1 DEFINITIONS 114 8.5.2
DERIVATION 115 CHAPTER 9 STATISTICAL PATTERN CLASSIFICATION 119 9.1
INTRODUCTION 119 9.2 A FEW DEFINITIONS 119 9.3 CLASS-RELATED PROBABILITY
FUNCTIONS 120 9.4 MINIMUM ERROR CLASSIFICATION 121 9.5 LIKELIHOOD-BASED
MAP CLASSIFICATION 122 9.6 APPROXIMATING A BAYES CLASSIFIER 123 9.7
STATISTICALLY BASED LINEAR DISCRIMINANTS 125 9.7.1 DISCUSSION 126 9.8
ITERATIVE TRAINING: THE EM ALGORITHM 126 9.8.1 DISCUSSION 131 9.9
EXERCISES 132 PART III CHAPTER 10 WAVE
BASICS 137 10.1 INTRODUCTION 137 10.2 THE WAVE EQUATION FOR THE
VIBRATING STRING 137 10.3 DISCRETE-TIME TRAVELING WAVES 139 10.4
BOUNDARY CONDITIONS AND DISCRETE TRAVELING WAVES 140 10.5 STANDING WAVES
140 10.6 DISCRETE-TIME MODELS OF ACOUSTIC TUBES 141 10.7 ACOUSTIC TUBE
RESONANCES 143 10.8 RELATION OF ACOUSTIC TUBE RESONANCES TO OBSERVED
FORMANT FREQUENCIES 144 10.9 EXERCISES 146 CHAPTER 11 ACOUSTIC TUBE
MODELING OF SPEECH PRODUCTION 148 11.1 INTRODUCTION 148 11.2 ACOUSTIC
TUBE MODELS OF ENGLISH PHONEMES 148 11.3 EXCITATION MECHANISMS IN SPEECH
PRODUCTION 152 11.4 EXERCISES 153 CHAPTER 12 MUSIC PRODUCTION 154 12.1
INTRODUCTION 154 12.2 SEQUENCE OF STEPS IN A PLUCKED OR BOWED STRING
INSTRUMENT 155 12.3 VIBRATIONS OF THE BOWED STRING 155 12.4
FREQUENCY-RESPONSE MEASUREMENTS OF THE BRIDGE OF A VIOLIN 156 12.5
VIBRATIONS OF THE BODY OF STRING INSTRUMENTS: MEASUREMENT METHODS 159
12.6 RADIATION PATTERN OF BOWED STRING INSTRUMENTS 163 12.7 SOME
CONSIDERATIONS IN PIANO DESIGN 165 12.8 BRIEF DISCUSSION OF THE TRUMPET,
TROMBONE, FRENCH HORN, AND TUBA 171 12.9 EXERCISES 173 CHAPTER 13 ROOM
ACOUSTICS 175 13.1 SOUND WAVES 175 13.1.1 ONE-DIMENSIONAL WAVE EQUATION
176 13.1.2 SPHERICAL WAVE EQUATION 177 13.1.3 INTENSITY 177 13.1.4
DECIBEL SOUND LEVELS 178 13.1.5 TYPICAL POWER SOURCES 178 13.2 SOUND
WAVES IN ROOMS 179 13.2.1 ACOUSTIC REVERBERATION 180 13.2.2 EARLY
REFLECTIONS 183 13.3 ROOM ACOUSTICS AS A COMPONENT IN SPEECH SYSTEMS 184
13.4 EXERCISES 185 PART IV AUDITORY PERCEPTION CHAPTER 14 EAR
PHYSIOLOGY 189 14.1 INTRODUCTION 189 14.2 ANATOMICAL PATHWAYS FROM THE
EAR TO THE PERCEPTION OF SOUND 189 14.3 THE PERIPHERAL AUDITORY SYSTEM
191 14.4 HAIR CELL AND AUDITORY NERVE FUNCTIONS 192 14.5 PROPERTIES OF
THE AUDITORY NERVE 194 14.6 SUMMARY AND BLOCK DIAGRAM OF THE PERIPHERAL
AUDITORY SYSTEM 201 14.7 EXERCISES 203 CHAPTER 15 PSYCHOACOUSTICS 205
15.1 INTRODUCTION 205 15.2 SOUND-PRESSURE LEVEL AND LOUDNESS 206 15.3
FREQUENCY ANALYSIS AND CRITICAL BANDS 208 15.4 MASKING 210 15.5 SUMMARY
212 15.6 EXERCISES 213 CHAPTER 16 MODELS OF PITCH PERCEPTION 214 16.1
INTRODUCTION 214 16.2 HISTORICAL REVIEW OF PITCH-PERCEPTION MODELS 214
16.3 PHYSIOLOGICAL EXPLORATION OF PLACE VERSUS PERIODICITY 219 16.4
RESULTS FROM PSYCHOACOUSTIC TESTING AND MODELS 220 16.5 SUMMARY 224 16.6
EXERCISES 226 CHAPTER 17 SPEECH PERCEPTION 228 17.1 INTRODUCTION 228
17.2 VOWEL PERCEPTION: PSYCHOACOUSTICS AND PHYSIOLOGY 228 17.3 THE
CONFUSION MATRIX 231 17.4 PERCEPTUAL CUES FOR PLOSIVES 234
17.5 PHYSIOLOGICAL STUDIES OF TWO VOICED PLOSIVES 235 17.6 MOTOR
THEORIES OF SPEECH PERCEPTION 237 17.7 NEURAL FIRING PATTERNS FOR
CONNECTED SPEECH STIMULI 239 17.8 CONCLUDING THOUGHTS 240 17.9 EXERCISES
243 CHAPTER 18 HUMAN SPEECH RECOGNITION 246 18.1 INTRODUCTION 246 18.2
THE ARTICULATION INDEX AND HUMAN RECOGNITION 246 18.2.1 THE BIG IDEA 246
18.2.2 THE EXPERIMENTS 247 18.2.3 DISCUSSION 248 18.3 COMPARISONS
BETWEEN HUMAN AND MACHINE SPEECH RECOGNIZERS 248 18.4 CONCLUDING
THOUGHTS 252 18.5 EXERCISES 253 PART V SPEECH FEATURES CHAPTER 19 THE
AUDITORY SYSTEM AS A FILTER BANK 257 19.1 INTRODUCTION 257 19.2 REVIEW
OF FLETCHER'S CRITICAL BAND EXPERIMENTS 257 19.3 RELATION BETWEEN
THRESHOLD MEASUREMENTS AND HYPOTHESIZED FILTER SHAPES 259 19.4
GAMMA-TONE FILTERS, ROEX FILTERS, AND AUDITORY MODELS 264 19.5 OTHER
CONSIDERATIONS IN FILTER-BANK DESIGN 266 19.6 SPEECH SPECTRUM ANALYSIS
USING THE FFT 268 19.7 CONCLUSIONS 269 19.8 EXERCISES 269 CHAPTER 20 THE
CEPSTRUM AS A SPECTRAL ANALYZER 271 20.1 INTRODUCTION 271 20.2 A
HISTORICAL NOTE 271 20.3 THE REAL CEPSTRUM 272 20.4 THE COMPLEX CEPSTRUM
273 20.5 APPLICATION OF CEPSTRAL ANALYSIS TO SPEECH SIGNALS 275 20.6
CONCLUDING THOUGHTS 277 20.7 EXERCISES 278 CHAPTER 21
LINEAR PREDICTION 280 21.1 INTRODUCTION 280 21.2 THE PREDICTIVE MODEL
280 21.3 PROPERTIES OF THE REPRESENTATION 284 21.4 GETTING THE
COEFFICIENTS 286 21.5 RELATED REPRESENTATIONS 288 21.6 CONCLUDING
DISCUSSION 289 21.7 EXERCISES 291 PART VI CHAPTER 22 FEATURE EXTRACTION
FOR ASR 295 22.1 INTRODUCTION 295 22.2 COMMON FEATURE VECTORS 295 22.3
DYNAMIC FEATURES 300 22.4 STRATEGIES FOR ROBUSTNESS 300 22.4.1
ROBUSTNESS TO CONVOLUTIONAL ERROR 300 22.4.2 ROBUSTNESS TO ADDITIVE
NOISE 304 22.4.3 CAVEATS 304 22.5 AUDITORY MODELS 305 22.6 MULTICHANNEL
INPUT 305 22.7 DISCUSSION 306 22.8 EXERCISES 306 CHAPTER 23 LINGUISTIC
CATEGORIES FOR SPEECH RECOGNITION 309 23.1 INTRODUCTION 309 23.2 PHONES
AND PHONEMES 309 23.2.1 OVERVIEW 309 23.2.2 WHAT MAKES A PHONE? 310
23.2.3 WHAT MAKES A PHONEME? 310 23.3 PHONETIC AND PHONEMIC ALPHABETS
311 23.4 ARTICULATORY FEATURES 312 23.4.1 OVERVIEW 312 23.4.2 CONSONANTS
312 23.4.3 VOWELS 316 23.4.4 WHY USE FEATURES? 316 23.5 SUBWORD UNITS AS
CATEGORIES FOR ASR 317 23.6 PHONOLOGICAL MODELS FOR ASR 317 23.7
CONTEXT-DEPENDENT PHONES 318 23.8 OTHER SUBWORD UNITS 319
23.8.1 PROPERTIES IN FLUENT SPEECH 320 23.9 PHRASES 320 23.10 SOME
ISSUES IN PHONOLOGICAL MODELING 320 23.11 EXERCISES 321 CHAPTER 24
DETERMINISTIC SEQUENCE RECOGNITION FOR ASR 324 24.1 INTRODUCTION 324
24.2 ISOLATED WORD RECOGNITION 325 24.2.1 LINEAR TIME WARP 326 24.2.2
DYNAMIC TIME WARP 327 24.2.3 DISTANCES 331 24.2.4 END-POINT DETECTION
331 24.3 CONNECTED WORD RECOGNITION 333 24.4 SEGMENTAL APPROACHES 334
24.5 DISCUSSION 335 24.6 EXERCISES 336 CHAPTER 25 STATISTICAL SEQUENCE
RECOGNITION 337 25.1 INTRODUCTION 337 25.2 STATING THE PROBLEM 338 25.3
PARAMETRIZATION AND PROBABILITY ESTIMATION 340 25.3.1 MARKOV MODELS 341
25.3.2 HIDDEN MARKOV MODEL 343 25.3.3 HMMS FOR SPEECH RECOGNITION 344
25.3.4 ESTIMATION OF P(X M) 345 25.4 CONCLUSION 349 25.5 EXERCISES 350
CHAPTER 26 STATISTICAL MODEL TRAINING 351 26.1 INTRODUCTION 351 26.2 HMM
TRAINING 352 26.3 FORWARD-BACKWARD TRAINING 355 26.4 OPTIMAL PARAMETERS
FOR EMISSION PROBABILITY ESTIMATORS 358 26.4.1 GAUSSIAN DENSITY
FUNCTIONS 358 26.4.2 EXAMPLE: TRAINING WITH DISCRETE DENSITIES 359 26.5
VITERBI TRAINING 360 26.5.1 EXAMPLE: TRAINING WITH GAUSSIAN DENSITY
FUNCTIONS 362 26.5.2 EXAMPLE: TRAINING WITH DISCRETE DENSITIES 362 26.6
LOCAL ACOUSTIC PROBABILITY ESTIMATORS FOR ASR 363 26.6.1 DISCRETE
PROBABILITIES 363 26.6.2 GAUSSIAN DENSITIES 363 26.6.3 TIED
MIXTURES OF GAUSSIANS 364 26.6.4 INDEPENDENT MIXTURES OF GAUSSIANS 364
26.6.5 NEURAL NETWORKS 364 26.7 INITIALIZATION 364 26.8 SMOOTHING 365
26.9 CONCLUSION 366 26.10 EXERCISES 366 CHAPTER 27 DISCRIMINANT ACOUSTIC
PROBABILITY ESTIMATION 367 27.1 INTRODUCTION 367 27.2 DISCRIMINANT
TRAINING 368 27.2.1 MAXIMUM MUTUAL INFORMATION 369 27.2.2 CORRECTIVE
TRAINING 369 27.2.3 GENERALIZED PROBABILISTIC DESCENT 370 27.2.4 DIRECT
ESTIMATION OF POSTERIORS 371 27.3 HMM-ANN BASED ASR 374 27.3.1 MLP
ARCHITECTURE 374 27.3.2 MLP TRAINING 374 27.3.3 EMBEDDED TRAINING 375
27.4 OTHER APPLICATIONS OF ANNS TO ASR 376 27.5 EXERCISES 377 27.6
APPENDIX: POSTERIOR PROBABILITY PROOF 377 CHAPTER 28 SPEECH RECOGNITION
AND UNDERSTANDING 380 28.1 INTRODUCTION 380 28.2 PHONOLOGICAL MODELS 381
28.3 LANGUAGE MODELS 383 28.3.1 N-GRAM STATISTICS 385 28.3.2 SMOOTHING
386 28.4 DECODING WITH ACOUSTIC AND LANGUAGE MODELS 387 28.5 A COMPLETE
SYSTEM 388 28.6 ACCEPTING REALISTIC INPUT 389 28.7 CONCLUDING COMMENTS
391 PART VII SYNTHESIS AND CODING CHAPTER 29 SPEECH SYNTHESIS 395 29.1
INTRODUCTION
395 29.2 PARAMETRIC SOURCE-FILTER SYNTHESIS 396 29.2.1 FORMANT
SYNTHESIZERS 397 29.2.2 OTHER SOURCE-FILTER SYNTHESIZER
STRUCTURES 399 29.2.3 TALKING CHIPS 402 29.3 CONCATENATIVE METHODS 403
29.4 SPECULATION 405 29.5 EXERCISES 406 29.6 APPENDIX: SYNTHESIZER
EXAMPLES 406 29.6.1 THE KLATT RECORDINGS 406 29.6.2 DEVELOPMENT OF
SPEECH SYNTHESIZERS 407 29.6.3 SEGMENTAL SYNTHESIS BY RULE 409 29.7
SYNTHESIS BY RULE OF SEGMENTS AND SENTENCE PROSODY 410 29.8 FULLY
AUTOMATIC TEXT-TO-SPEECH CONVERSION 410 29.8.1 THE VAN SANTEN RECORDINGS
411 CHAPTER 30 PITCH DETECTION 415 30.1 INTRODUCTION 415 30.2 A NOTE ON
NOMENCLATURE 415 30.3 PITCH DETECTION PERCEPTION AND ARTICULATION 416
30.4 THE VOICING DECISION 416 30.5 SOME DIFFICULTIES IN PITCH DETECTION
418 30.6 SIGNAL PROCESSING TO IMPROVE PITCH DETECTION 418 30.7
PATTERN-RECOGNITION METHODS FOR PITCH DETECTION 422 30.8 MEDIAN
SMOOTHING TO FIX ERRORS IN PITCH ESTIMATION 426 30.9 EXERCISES 428
CHAPTER 31 VOCODERS 431 31.1 INTRODUCTION 431 31.2 STANDARDS FOR
DIGITAL SPEECH CODING 431 31.3 DESIGN CONSIDERATIONS IN CHANNEL VOCODER
FILTER BANKS 431 31.4 ENERGY MEASUREMENTS IN A CHANNEL VOCODER 434 31.5
A VOCODER DESIGN FOR SPECTRAL ENVELOPE ESTIMATION 436 31.6 BIT SAVING IN
CHANNEL VOCODERS 436 31.7 DESIGN OF THE EXCITATION PARAMETERS FOR A
CHANNEL VOCODER 440 31.8 LPC VOCODERS 442 31.9 CEPSTRAL VOCODERS 443
31.10 DESIGN COMPARISONS 443 31.11 VOCODER STANDARDIZATION 446 31.12
EXERCISES 447 CHAPTER 32 LOW-RATE VOCODERS 451 32.1 INTRODUCTION 451
32.2 THE FRAME-FILL CONCEPT 452 32.3 PATTERN MATCHING OR
VECTOR QUANTIZATION 454 32.4 THE KANG-COULTER 600-BPS VOCODER 455 32.5
SEGMENTATION METHODS FOR BANDWIDTH REDUCTION 456 32.6 EXERCISES 461
CHAPTER 33 MEDIUM-RATE AND HIGH-RATE VOCODERS 463 33.1 INTRODUCTION 463
33.2 VOICE EXCITATION AND SPECTRAL FLATTENING 463 33.3 VOICE-EXCITED
CHANNEL VOCODER 464 33.4 VOICE-EXCITED AND ERROR-SIGNAL-EXCITED LPC
VOCODERS 466 33.5 WAVEFORM CODING WITH PREDICTIVE METHODS 468 33.6
ADAPTIVE PREDICTIVE CODING OF SPEECH 470 33.7 SUBBAND CODING 471 33.8
MULTIPULSE LPC VOCODERS 472 33.9 CODE-EXCITED LINEAR PREDICTIVE CODING
474 33.9.1 MODIFICATIONS TO CELP 476 33.9.2 NON-GAUSSIAN CODEBOOK
SEQUENCES 476 33.9.3 LOW-DELAY CELP 476 33.10 REDUCING CODEBOOK SEARCH
TIME IN CELP 478 33.10.1 FILTER SIMPLIFICATION 478 33.10.2 SPEEDING UP
THE SEARCH 479 33.10.3 MULTIRESOLUTION CODEBOOK SEARCH 481 33.10.4
PARTIAL SEQUENCE ELIMINATION 482 33.10.5 TREE-STRUCTURED DELTA CODEBOOKS
482 33.10.6 ADAPTIVE CODEBOOKS 483 33.10.7 LINEAR COMBINATION CODEBOOKS
484 33.10.8 VECTOR SUM EXCITED LINEAR PREDICTION 485 33.11 ADAPTIVE
TRANSFORM CODING 485 33.12 CONCLUSIONS 485 33.13 EXERCISES 486 PART VIII
OTHER
APPLICATIONS CHAPTER 34 SPEECH TRANSFORMATIONS 491 34.1 INTRODUCTION 491
34.2 TIME-SCALE MODIFICATION 491 34.3 TRANSFORMATION WITHOUT EXPLICIT
PITCH DETECTION 494 34.4 TRANSFORMATIONS IN ANALYSIS-SYNTHESIS SYSTEMS
495 34.5 HYBRID SYSTEMS 498 34.6 SPEECH MODIFICATION IN PHASE VOCODERS
498 34.7 SPEECH TRANSFORMATIONS WITHOUT PITCH EXTRACTION
499 34.7.1 FREQUENCY COMPRESSION AND GENDER TRANSFORMATION 501 34.8 THE
SINE TRANSFORM CODER AS A TRANSFORMATION ALGORITHM 502 34.9 VOICE
MODIFICATION TO EMULATE A TARGET VOICE 504 34.10 EXERCISES 505 CHAPTER
35 SOME ASPECTS OF COMPUTER MUSIC SYNTHESIS 507 35.1 INTRODUCTION 507
35.2 SOME EXAMPLES OF ACOUSTICALLY GENERATED MUSICAL SOUNDS 507 35.3
MUSIC SYNTHESIS CONCEPTS 509 35.4 ANALYSIS-BASED SYNTHESIS 511 35.5
OTHER TECHNIQUES FOR MUSIC SYNTHESIS 514 35.6 REVERBERATION 516 35.7
SEVERAL EXAMPLES OF SYNTHESIS 517 35.8 EXERCISES 519 35.9 ACKNOWLEDGMENT
519 CHAPTER 36 SPEAKER VERIFICATION 521 36.1 INTRODUCTION 521 36.2
ACOUSTIC PARAMETERS 522 36.3 SIMILARITY MEASURES 523 36.4 TEXT-DEPENDENT
SPEAKER VERIFICATION 525 36.5 TEXT-INDEPENDENT SPEAKER VERIFICATION 526
36.6 TEXT-PROMPTED SPEAKER VERIFICATION 527 36.7 IDENTIFICATION,
VERIFICATION, AND THE DECISION THRESHOLD 528 36.8 EXERCISES 529 INDEX
531
|
any_adam_object | 1 |
author | Gold, Ben Morgan, Nelson |
author_facet | Gold, Ben Morgan, Nelson |
author_role | aut aut |
author_sort | Gold, Ben |
author_variant | b g bg n m nm |
building | Verbundindex |
bvnumber | BV013148731 |
callnumber-first | T - Technology |
callnumber-label | TK7882 |
callnumber-raw | TK7882.S65 |
callnumber-search | TK7882.S65 |
callnumber-sort | TK 47882 S65 |
callnumber-subject | TK - Electrical and Nuclear Engineering |
classification_rvk | ZN 6060 |
ctrlnum | (OCoLC)318207673 (DE-599)BVBBV013148731 |
dewey-full | 621.382/2 |
dewey-hundreds | 600 - Technology (Applied sciences) |
dewey-ones | 621 - Applied physics |
dewey-raw | 621.382/2 |
dewey-search | 621.382/2 |
dewey-sort | 3621.382 12 |
dewey-tens | 620 - Engineering and allied operations |
discipline | Elektrotechnik / Elektronik / Nachrichtentechnik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01892nam a2200445 c 4500</leader><controlfield tag="001">BV013148731</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20000515 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">000511s2000 ad|| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">0471351547</subfield><subfield code="9">0-471-35154-7</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)318207673</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV013148731</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-29T</subfield><subfield code="a">DE-703</subfield><subfield code="a">DE-1102</subfield><subfield code="a">DE-573</subfield><subfield code="a">DE-634</subfield><subfield code="a">DE-83</subfield></datafield><datafield tag="050" ind1=" " ind2="0"><subfield code="a">TK7882.S65</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">621.382/2</subfield><subfield code="2">21</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ZN 6060</subfield><subfield code="0">(DE-625)157500:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Gold, Ben</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Speech and audio signal processing</subfield><subfield code="b">processing and perception of speech and music</subfield><subfield code="c">Ben Gold ; 
Nelson Morgan ; with contributions from Hervé Bourlard ...</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">New York [u.a.]</subfield><subfield code="b">Wiley</subfield><subfield code="c">2000</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XVIII, 537 S.</subfield><subfield code="b">Ill., graph. Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Música electrónica</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Procesamiento de señales - Técnicas digitales</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Sistemas de procesamiento de la voz</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Digitale Sprachverarbeitung</subfield><subfield code="0">(DE-588)4233857-8</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Automatische Sprachproduktion</subfield><subfield code="0">(DE-588)4143703-2</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Digitale Sprachverarbeitung</subfield><subfield code="0">(DE-588)4233857-8</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Automatische Sprachproduktion</subfield><subfield code="0">(DE-588)4143703-2</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield 
code="8">1\p</subfield><subfield code="5">DE-604</subfield></datafield><datafield tag="689" ind1="1" ind2="0"><subfield code="a">Digitale Sprachverarbeitung</subfield><subfield code="0">(DE-588)4233857-8</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="1" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Morgan, Nelson</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">GBV Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=008958038&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-008958038</subfield></datafield><datafield tag="883" ind1="1" ind2=" "><subfield code="8">1\p</subfield><subfield code="a">cgwrk</subfield><subfield code="d">20201028</subfield><subfield code="q">DE-101</subfield><subfield code="u">https://d-nb.info/provenance/plan#cgwrk</subfield></datafield></record></collection> |
id | DE-604.BV013148731 |
illustrated | Illustrated |
indexdate | 2024-07-09T18:39:51Z |
institution | BVB |
isbn | 0471351547 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-008958038 |
oclc_num | 318207673 |
open_access_boolean | |
owner | DE-29T DE-703 DE-1102 DE-573 DE-634 DE-83 |
owner_facet | DE-29T DE-703 DE-1102 DE-573 DE-634 DE-83 |
physical | XVIII, 537 S. Ill., graph. Darst. |
publishDate | 2000 |
publishDateSearch | 2000 |
publishDateSort | 2000 |
publisher | Wiley |
record_format | marc |
spelling | Gold, Ben Verfasser aut Speech and audio signal processing processing and perception of speech and music Ben Gold ; Nelson Morgan ; with contributions from Hervé Bourlard ... New York [u.a.] Wiley 2000 XVIII, 537 S. Ill., graph. Darst. txt rdacontent n rdamedia nc rdacarrier Música electrónica Procesamiento de señales - Técnicas digitales Sistemas de procesamiento de la voz Digitale Sprachverarbeitung (DE-588)4233857-8 gnd rswk-swf Automatische Sprachproduktion (DE-588)4143703-2 gnd rswk-swf Digitale Sprachverarbeitung (DE-588)4233857-8 s Automatische Sprachproduktion (DE-588)4143703-2 s 1\p DE-604 DE-604 Morgan, Nelson Verfasser aut GBV Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=008958038&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis 1\p cgwrk 20201028 DE-101 https://d-nb.info/provenance/plan#cgwrk |
spellingShingle | Gold, Ben Morgan, Nelson Speech and audio signal processing processing and perception of speech and music Música electrónica Procesamiento de señales - Técnicas digitales Sistemas de procesamiento de la voz Digitale Sprachverarbeitung (DE-588)4233857-8 gnd Automatische Sprachproduktion (DE-588)4143703-2 gnd |
subject_GND | (DE-588)4233857-8 (DE-588)4143703-2 |
title | Speech and audio signal processing processing and perception of speech and music |
title_auth | Speech and audio signal processing processing and perception of speech and music |
title_exact_search | Speech and audio signal processing processing and perception of speech and music |
title_full | Speech and audio signal processing processing and perception of speech and music Ben Gold ; Nelson Morgan ; with contributions from Hervé Bourlard ... |
title_fullStr | Speech and audio signal processing processing and perception of speech and music Ben Gold ; Nelson Morgan ; with contributions from Hervé Bourlard ... |
title_full_unstemmed | Speech and audio signal processing processing and perception of speech and music Ben Gold ; Nelson Morgan ; with contributions from Hervé Bourlard ... |
title_short | Speech and audio signal processing |
title_sort | speech and audio signal processing processing and perception of speech and music |
title_sub | processing and perception of speech and music |
topic | Música electrónica Procesamiento de señales - Técnicas digitales Sistemas de procesamiento de la voz Digitale Sprachverarbeitung (DE-588)4233857-8 gnd Automatische Sprachproduktion (DE-588)4143703-2 gnd |
topic_facet | Música electrónica Procesamiento de señales - Técnicas digitales Sistemas de procesamiento de la voz Digitale Sprachverarbeitung Automatische Sprachproduktion |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=008958038&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT goldben speechandaudiosignalprocessingprocessingandperceptionofspeechandmusic AT morgannelson speechandaudiosignalprocessingprocessingandperceptionofspeechandmusic |