Availability: Audiovisual speech processing

Audiovisual speech processing

Bibliographic details
Other authors:	Bailly, Gérard (editor)
Format:	Book
Language:	English
Published:	Cambridge [et al.]: Cambridge Univ. Press, 2012
Edition:	1st publ.
Subjects:	Speech Perception; Lipreading; Phonetics; Speech / physiology; Visual Perception; Sprachverarbeitung
Online access:	Table of contents; Blurb
Summary:	"When we speak, we configure the vocal tract which shapes the visible motions of the face and the patterning of the audible speech acoustics. Similarly, we use these visible and audible behaviors to perceive speech. This book showcases a broad range of research investigating how these two types of signals are used in spoken communication, how they interact, and how they can be used to enhance the realistic synthesis and recognition of audible and visible speech. The volume begins by addressing two important questions about human audiovisual performance: how auditory and visual signals combine to access the mental lexicon and where in the brain this and related processes take place. It then turns to the production and perception of multimodal speech and how structures are coordinated within and across the two modalities. Finally, the book presents overviews and recent developments in machine-based speech recognition and synthesis of AV speech"--Provided by publisher
Notes:	Bibliography: pp. 403-468. This record also covers later, unchanged reprints.
Physical description:	XXXVI, 470 pp., ill., graphs
ISBN:	9781107006829 9781107499324 1107006821

Internal format

MARC


LEADER	00000nam a2200000 c 4500
001	BV039972387
003	DE-604
005	20151123
007	t
008	120321s2012 ad|| |||| 00||| eng d
020			|a 9781107006829 |c hbk. |9 978-1-107-00682-9
020			|a 9781107499324 |c pbk. |9 978-1-107-49932-4
020			|a 1107006821 |9 1-107-00682-1
035			|a (OCoLC)785855660
035			|a (DE-599)BVBBV039972387
040			|a DE-604 |b ger |e rakwb
041	0		|a eng
049			|a DE-12 |a DE-739 |a DE-188 |a DE-11
084			|a ES 950 |0 (DE-625)27936: |2 rvk
084			|a ET 215 |0 (DE-625)27955: |2 rvk
084			|a ET 220 |0 (DE-625)27956: |2 rvk
245	1	0	|a Audiovisual speech processing |c ed. by Gérard Bailly ...
250			|a 1. publ.
264		1	|a Cambridge [u.a.] |b Cambridge Univ. Press |c 2012
300			|a XXXVI, 470 S. |b Ill., graph. Darst.
336			|b txt |2 rdacontent
337			|b n |2 rdamedia
338			|b nc |2 rdacarrier
500			|a Literaturverz. S. 403 - 468
500			|a Hier auch später erschienene, unveränderte Nachdrucke
520			|a "When we speak, we configure the vocal tract which shapes the visible motions of the face and the patterning of the audible speech acoustics. Similarly, we use these visible and audible behaviors to perceive speech. This book showcases a broad range of research investigating how these two types of signals are used in spoken communication, how they interact, and how they can be used to enhance the realistic synthesis and recognition of audible and visible speech. The volume begins by addressing two important questions about human audiovisual performance: how auditory and visual signals combine to access the mental lexicon and where in the brain this and related processes take place. It then turns to the production and perception of multimodal speech and how structures are coordinated within and across the two modalities. Finally, the book presents overviews and recent developments in machine-based speech recognition and synthesis of AV speech"--Provided by publisher
650		4	|a Speech Perception
650		4	|a Lipreading
650		4	|a Phonetics
650		4	|a Speech / physiology
650		4	|a Visual Perception
650	0	7	|a Sprachverarbeitung |0 (DE-588)4116579-2 |2 gnd |9 rswk-swf
689	0	0	|a Sprachverarbeitung |0 (DE-588)4116579-2 |D s
689	0		|8 1\p |5 DE-604
700	1		|a Bailly, Gérard |4 edt
856	4	2	|m HBZ Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=024829896&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis
856	4	2	|m Digitalisierung UB Passau |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=024829896&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |3 Klappentext
999			|a oai:aleph.bib-bvb.de:BVB01-024829896
883	1		|8 1\p |a cgwrk |d 20201028 |q DE-101 |u https://d-nb.info/provenance/plan#cgwrk

Record in search index

_version_	1804148951535845376
adam_text	Title: Audiovisual speech processing. Author: Bailly, Gérard. Year: 2012.

Contents

List of figures
List of tables
List of contributors
Preface
Acknowledgments
Introduction

1 Three puzzles of multimodal speech perception (R. E. REMEZ)
1.1 Introduction
1.2 Organization
1.3 Event perception and speech perception
1.4 Experience
1.5 A conclusion
1.6 Acknowledgments

2 Visual speech perception (L. E. BERNSTEIN)
2.1 Introduction
2.2 Evaluation of visemes and word homopheny
2.3 Phonetic distinctiveness of English words
2.4 Research strategies
2.5 General conclusions
2.6 Acknowledgments

3 Dynamic information for face perception (K. LANDER AND V. BRUCE)
3.1 Introduction
3.2 Motion information for expression perception
3.3 Motion information for visual speech perception
3.4 Dynamic information for familiar face recognition
3.5 Dynamic information for unfamiliar face learning
3.6 Practical considerations
3.7 Theoretical interpretations
3.8 Future research and conclusions

4 Investigating auditory-visual speech perception development (D. BURNHAM AND K. SEKIYAMA)
4.1 Speech perception is auditory-visual
4.2 Auditory-visual speech perception
4.3 Methods for investigating development
4.4 The ontogenetic development method
4.5 The cross-language development method
4.6 Combined methods
4.7 Conclusions and an application: automatic speech recognition
4.8 Acknowledgments

5 Brain bases for seeing speech: fMRI studies of speechreading (R. CAMPBELL AND M. MACSWEENEY)
5.1 Introduction
5.2 Route maps and guidelines
5.3 Silent speechreading and auditory cortex
5.4 Audiovisual integration: timing
5.5 Speechreading: other cortical regions
5.6 Speechreading in people born deaf
5.7 Conclusions, directions
5.8 Acknowledgments
5.9 Appendix: glossary of acronyms and terms

6 Temporal organization of Cued Speech production (D. BEAUTEMPS, M.-A. CATHIARD, V. ATTINA, AND C. SAVARIAUX)
6.1 Introduction
6.2 Overview on manual cueing
6.3 First results on Cued Speech production
6.4 General discussion
6.5 Acknowledgments

7 Bimodal perception within the natural time-course of speech production (M.-A. CATHIARD, A. VILAIN, R. LABOISSIERE, H. LOEVENBRUCK, C. SAVARIAUX, AND J.-L. SCHWARTZ)
7.1 Introduction
7.2 The 2-Component-Vowel model
7.3 The 2-Comp-Vowel model and visible speech
7.4 The perceptual benefit of the model
7.5 Conclusion and perspectives
7.6 Post-scriptum
7.7 Acknowledgments

8 Visual and audiovisual synthesis and recognition of speech by computers (N. M. BROOKE AND S. D. SCOTT)
8.1 Overview
8.2 The historical perspective
8.3 Heads, faces, and visible speech signals
8.4 Automatic audiovisual speech processing
8.5 Assessing and perceiving audiovisual speech
8.6 Current prospects

9 Audiovisual automatic speech recognition (G. POTAMIANOS, C. NETI, J. LUETTIN, AND I. MATTHEWS)
9.1 Introduction
9.2 Visual front ends
9.3 Audiovisual integration
9.4 Audiovisual databases
9.5 Audiovisual ASR experiments
9.6 Summary and discussion
9.7 Acknowledgments

10 Image-based facial synthesis (M. SLANEY AND C. BREGLER)
10.1 Facial synthesis approaches
10.2 Image-based facial synthesis
10.3 Analyses and normalization
10.4 Synthesis
10.5 Alternative approaches
10.6 Conclusions
10.7 Acknowledgments

11 A trainable videorealistic speech animation system (T. EZZAT, G. GEIGER, AND T. POGGIO)
11.1 Overview
11.2 Background
11.3 System overview
11.4 Corpus
11.5 Pre-processing
11.6 Multidimensional morphable models
11.7 Trajectory synthesis
11.8 Post-processing
11.9 Computational issues
11.10 Evaluation
11.11 Further work
11.12 Acknowledgments
11.13 Appendix

12 Animated speech: research progress and applications (D. W. MASSARO, M. M. COHEN, M. TABAIN, J. BESKOW, AND R. CLARK)
12.1 Background
12.2 Visible speech synthesis
12.3 Illustrative experiment of evaluation testing
12.4 The use of synthetic speech and facial animation
12.5 New structures and their control
12.6 Reshaping the canonical head
12.7 Training speech articulation using dynamic 3D measurements
12.8 Some applications of electropalatography to speech therapy
12.9 Development of a speech tutor
12.10 Empirical studies
12.11 Additional potential applications
12.12 Acknowledgments

13 Empirical perceptual-motor linkage of multimodal speech (E. VATIKIOTIS-BATESON AND K. G. MUNHALL)
13.1 Introduction
13.2 The perception of audiovisual speech
13.3 Bringing speech production to the face
13.4 Auditory-visual speech production
13.5 Correspondences of multimodal speech
13.6 Talking head animation
13.7 The importance of physical structure
13.8 Communicative versus cosmetic realism
13.9 Summary
13.10 Acknowledgments

14 Sensorimotor characteristics of speech production (G. BAILLY, P. BADIN, L. REVERET, AND A. BEN YOUSSEF)
14.1 Introduction
14.2 Speech maps
14.3 Degrees-of-freedom in a speech task
14.4 Models of the underlying speech organs
14.5 Models of facial deformation
14.6 Linking articulatory degrees-of-freedom
14.7 Discussion
14.8 Conclusions
14.9 Acknowledgments

Notes
References
Index

Audiovisual Speech Processing

When we speak, we configure the vocal tract which shapes the visible motions of the face and the patterning of the audible speech acoustics. Similarly, we use these visible and audible behaviors to perceive speech. This book showcases a broad range of research investigating how these two types of signals are used in spoken communication, how they interact, and how they can be used to enhance the realistic synthesis and recognition of audible and visible speech.

The volume begins by addressing two important questions about human audiovisual performance: how auditory and visual signals combine to access the mental lexicon, and where in the brain this and related processes take place. It then turns to the production and perception of multimodal speech, and how structures are coordinated within and across the two modalities. Finally, the book presents overviews and recent developments in machine-based speech recognition and synthesis of AV speech.
any_adam_object	1
author2	Bailly, Gérard
author2_role	edt
author2_variant	g b gb
author_facet	Bailly, Gérard
building	Verbundindex
bvnumber	BV039972387
classification_rvk	ES 950 ET 215 ET 220
ctrlnum	(OCoLC)785855660 (DE-599)BVBBV039972387
discipline	Sprachwissenschaft Literaturwissenschaft
edition	1. publ.
format	Book
fullrecord	<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>03012nam a2200493 c 4500</leader><controlfield tag="001">BV039972387</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20151123 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">120321s2012 ad\|\| \|\|\|\| 00\|\|\| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781107006829</subfield><subfield code="c">hbk.</subfield><subfield code="9">978-1-107-00682-9</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781107499324</subfield><subfield code="c">pbk.</subfield><subfield code="9">978-1-107-49932-4</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">1107006821</subfield><subfield code="9">1-107-00682-1</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)785855660</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV039972387</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-12</subfield><subfield code="a">DE-739</subfield><subfield code="a">DE-188</subfield><subfield code="a">DE-11</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ES 950</subfield><subfield code="0">(DE-625)27936:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ET 215</subfield><subfield code="0">(DE-625)27955:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ET 220</subfield><subfield code="0">(DE-625)27956:</subfield><subfield 
code="2">rvk</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Audiovisual speech processing</subfield><subfield code="c">ed. by Gérard Bailly ...</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">1. publ.</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Cambridge [u.a.]</subfield><subfield code="b">Cambridge Univ. Press</subfield><subfield code="c">2012</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XXXVI, 470 S.</subfield><subfield code="b">Ill., graph. Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Literaturverz. S. 403 - 468</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Hier auch später erschienene, unveränderte Nachdrucke</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">"When we speak, we configure the vocal tract which shapes the visible motions of the face and the patterning of the audible speech acoustics. Similarly, we use these visible and audible behaviors to perceive speech. This book showcases a broad range of research investigating how these two types of signals are used in spoken communication, how they interact, and how they can be used to enhance the realistic synthesis and recognition of audible and visible speech. The volume begins by addressing two important questions about human audiovisual performance: how auditory and visual signals combine to access the mental lexicon and where in the brain this and related processes take place. 
It then turns to the production and perception of multimodal speech and how structures are coordinated within and across the two modalities. Finally, the book presents overviews and recent developments in machine-based speech recognition and synthesis of AV speech"--Provided by publisher</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Speech Perception</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Lipreading</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Phonetics</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Speech / physiology</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Visual Perception</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Sprachverarbeitung</subfield><subfield code="0">(DE-588)4116579-2</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Sprachverarbeitung</subfield><subfield code="0">(DE-588)4116579-2</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="8">1\p</subfield><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Bailly, Gérard</subfield><subfield code="4">edt</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">HBZ Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=024829896&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Passau</subfield><subfield code="q">application/pdf</subfield><subfield 
code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=024829896&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Klappentext</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-024829896</subfield></datafield><datafield tag="883" ind1="1" ind2=" "><subfield code="8">1\p</subfield><subfield code="a">cgwrk</subfield><subfield code="d">20201028</subfield><subfield code="q">DE-101</subfield><subfield code="u">https://d-nb.info/provenance/plan#cgwrk</subfield></datafield></record></collection>
id	DE-604.BV039972387
illustrated	Illustrated
indexdate	2024-07-10T00:15:18Z
institution	BVB
isbn	9781107006829 9781107499324 1107006821
language	English
oai_aleph_id	oai:aleph.bib-bvb.de:BVB01-024829896
oclc_num	785855660
open_access_boolean
owner	DE-12 DE-739 DE-188 DE-11
owner_facet	DE-12 DE-739 DE-188 DE-11
physical	XXXVI, 470 S. Ill., graph. Darst.
publishDate	2012
publishDateSearch	2012
publishDateSort	2012
publisher	Cambridge Univ. Press
record_format	marc
spelling	Audiovisual speech processing ed. by Gérard Bailly ... 1. publ. Cambridge [u.a.] Cambridge Univ. Press 2012 XXXVI, 470 S. Ill., graph. Darst. txt rdacontent n rdamedia nc rdacarrier Literaturverz. S. 403 - 468 Hier auch später erschienene, unveränderte Nachdrucke "When we speak, we configure the vocal tract which shapes the visible motions of the face and the patterning of the audible speech acoustics. Similarly, we use these visible and audible behaviors to perceive speech. This book showcases a broad range of research investigating how these two types of signals are used in spoken communication, how they interact, and how they can be used to enhance the realistic synthesis and recognition of audible and visible speech. The volume begins by addressing two important questions about human audiovisual performance: how auditory and visual signals combine to access the mental lexicon and where in the brain this and related processes take place. It then turns to the production and perception of multimodal speech and how structures are coordinated within and across the two modalities. Finally, the book presents overviews and recent developments in machine-based speech recognition and synthesis of AV speech"--Provided by publisher Speech Perception Lipreading Phonetics Speech / physiology Visual Perception Sprachverarbeitung (DE-588)4116579-2 gnd rswk-swf Sprachverarbeitung (DE-588)4116579-2 s 1\p DE-604 Bailly, Gérard edt HBZ Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=024829896&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis Digitalisierung UB Passau application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=024829896&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA Klappentext 1\p cgwrk 20201028 DE-101 https://d-nb.info/provenance/plan#cgwrk
spellingShingle	Audiovisual speech processing Speech Perception Lipreading Phonetics Speech / physiology Visual Perception Sprachverarbeitung (DE-588)4116579-2 gnd
subject_GND	(DE-588)4116579-2
title	Audiovisual speech processing
title_auth	Audiovisual speech processing
title_exact_search	Audiovisual speech processing
title_full	Audiovisual speech processing ed. by Gérard Bailly ...
title_fullStr	Audiovisual speech processing ed. by Gérard Bailly ...
title_full_unstemmed	Audiovisual speech processing ed. by Gérard Bailly ...
title_short	Audiovisual speech processing
title_sort	audiovisual speech processing
topic	Speech Perception Lipreading Phonetics Speech / physiology Visual Perception Sprachverarbeitung (DE-588)4116579-2 gnd
topic_facet	Speech Perception Lipreading Phonetics Speech / physiology Visual Perception Sprachverarbeitung
url	http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=024829896&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=024829896&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA
work_keys_str_mv	AT baillygerard audiovisualspeechprocessing

Availability

No print copy is available.

The title can be ordered via interlibrary loan. Note: it is not in the THWS holdings.
