Robust methods for content analysis of auditory scenes
Main Author: | Geiger, Jürgen Thomas |
---|---|
Format: | Thesis Book |
Language: | English |
Published: | München Verl. Dr. Hut 2015 |
Edition: | 1st ed. |
Series: | Informationstechnik |
Subjects: | |
Online Access: | Table of contents |
Description: | IX, 172 pages, diagrams |
ISBN: | 9783843919869 |
Staff View
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV042391552 | ||
003 | DE-604 | ||
005 | 00000000000000.0 | ||
007 | t | ||
008 | 150304s2015 d||| m||| 00||| eng d | ||
020 | |a 9783843919869 |9 978-3-8439-1986-9 | ||
035 | |a (OCoLC)904452916 | ||
035 | |a (DE-599)BVBBV042391552 | ||
040 | |a DE-604 |b ger |e rakwb | ||
041 | 0 | |a eng | |
049 | |a DE-91 |a DE-12 | ||
084 | |a DAT 815d |2 stub | ||
100 | 1 | |a Geiger, Jürgen Thomas |e Verfasser |4 aut | |
245 | 1 | 0 | |a Robust methods for content analysis of auditory scenes |c Jürgen Thomas Geiger |
250 | |a 1. Aufl. | ||
264 | 1 | |a München |b Verl. Dr. Hut |c 2015 | |
300 | |a IX, 172 S. |b graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 0 | |a Informationstechnik | |
502 | |a Zugl.: München, Techn. Univ., Diss., 2014 | ||
650 | 0 | 7 | |a Automatische Identifikation |0 (DE-588)4206098-9 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Robustheit |0 (DE-588)4126481-2 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Geräuschanalyse |0 (DE-588)4324422-1 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Automatische Inhaltsanalyse |0 (DE-588)4265353-8 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Gesprochene Sprache |0 (DE-588)4020717-1 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Automatische Sprechererkennung |0 (DE-588)4143704-4 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Störgeräusch |0 (DE-588)4343358-3 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Nachhall |0 (DE-588)4171018-6 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Automatische Spracherkennung |0 (DE-588)4003961-4 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Gehen |0 (DE-588)4140871-8 |2 gnd |9 rswk-swf |
655 | 7 | |0 (DE-588)4113937-9 |a Hochschulschrift |2 gnd-content | |
689 | 0 | 0 | |a Geräuschanalyse |0 (DE-588)4324422-1 |D s |
689 | 0 | 1 | |a Gehen |0 (DE-588)4140871-8 |D s |
689 | 0 | 2 | |a Automatische Sprechererkennung |0 (DE-588)4143704-4 |D s |
689 | 0 | 3 | |a Automatische Identifikation |0 (DE-588)4206098-9 |D s |
689 | 0 | 4 | |a Automatische Spracherkennung |0 (DE-588)4003961-4 |D s |
689 | 0 | 5 | |a Gesprochene Sprache |0 (DE-588)4020717-1 |D s |
689 | 0 | 6 | |a Automatische Inhaltsanalyse |0 (DE-588)4265353-8 |D s |
689 | 0 | 7 | |a Störgeräusch |0 (DE-588)4343358-3 |D s |
689 | 0 | 8 | |a Nachhall |0 (DE-588)4171018-6 |D s |
689 | 0 | 9 | |a Robustheit |0 (DE-588)4126481-2 |D s |
689 | 0 | |5 DE-604 | |
856 | 4 | 2 | |m DNB Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027827424&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-027827424 |
Record in the search index
_version_ | 1804153040913039360 |
---|---|
adam_text | CONTENTS
1 INTRODUCTION 1
1.1 OBJECTIVES 2
1.2 STRUCTURE OF THIS THESIS 5
2 RECOGNITION OF ACOUSTIC SCENES AND EVENTS 7
2.1 ACOUSTIC SCENE CLASSIFICATION 7
2.1.1 INTRODUCTION 8
2.1.2 SYSTEM OVERVIEW 9
2.1.3 FEATURE EXTRACTION 9
2.1.4 WINDOW-BASED CLASSIFICATION 11
2.1.5 LATENT PERCEPTUAL INDEXING 12
2.1.6 EXPERIMENTAL EVALUATION 14
2.1.7 CONCLUSIONS 19
2.2 SUPERVISED LEARNING OF NEW SOUND EVENTS 21
2.2.1 ACOUSTIC EVENT CLASSIFICATION 22
2.2.2 EXPERIMENTAL EVALUATION 26
2.2.3 CONCLUSIONS 29
2.3 CHAPTER SUMMARY 30
3 ACOUSTIC GAIT-BASED PERSON IDENTIFICATION 31
3.1 INTRODUCTION 31
3.1.1 CONTRIBUTIONS 32
3.1.2 RELATED WORK 33
3.2 THE TUM GAID DATABASE 34
3.3 ACOUSTIC GAIT-BASED PERSON IDENTIFICATION USING SVM 36
3.3.1 CANDIDATE FEATURES 37
3.3.2 CLASSIFICATION 38
3.3.3 BASELINE RESULTS 38
HTTP://D-NB.INFO/1067206310
3.3.4 FEATURE ANALYSIS 39
3.3.5 MULTIMODAL FUSION 42
3.3.6 CONCLUSIONS 44
3.4 ACOUSTIC GAIT-BASED PERSON IDENTIFICATION USING HMMS 44
3.4.1 SYSTEM DESCRIPTION 45
3.4.2 EXPERIMENTAL EVALUATION 47
3.4.3 CONCLUSIONS 49
3.5 CHAPTER SUMMARY 50
4 SPEAKER DIARIZATION 51
4.1 INTRODUCTION 51
4.2 FUNDAMENTALS AND METHODS 53
4.2.1 SPEAKER DIARIZATION METHODS 53
4.2.2 THE DIARIZATION ERROR RATE 55
4.2.3 DATABASES 56
4.2.4 OPEN ISSUES 56
4.3 DETECTION OF OVERLAPPING SPEECH 57
4.3.1 OVERLAPPING SPEECH IN HUMAN CONVERSATIONS 59
4.3.2 RELATED WORK ON OVERLAP DETECTION AND HANDLING 61
4.3.3 EXPERIMENTAL FRAMEWORK 63
4.3.4 OVERLAP DETECTION USING A SOURCE SEPARATION METHOD 66
4.3.5 AUDIO FEATURES FOR OVERLAP DETECTION 73
4.3.6 OVERLAP DETECTION USING LEXICAL INFORMATION 78
4.3.7 OVERLAP DETECTION WITH MEMORY-ENHANCED RECURRENT NEURAL NETWORKS 85
4.3.8 SUMMARY OF OVERLAP DETECTION RESULTS 90
4.4 OVERLAP HANDLING 92
4.4.1 METHODOLOGY 93
4.4.2 RESULTS AND CONCLUSIONS 93
4.5 ONLINE SPEAKER DIARIZATION 94
4.5.1 METHODOLOGY 96
4.5.2 EXPERIMENTAL EVALUATION 98
4.6 CHAPTER SUMMARY 101
5 ROBUST SPEECH RECOGNITION 103
5.1 INTRODUCTION 103
5.1.1 CONTRIBUTIONS 104
5.1.2 RELATED WORK 105
5.2 LONG SHORT-TERM MEMORY RECURRENT NEURAL NETWORKS 106
5.3 RECOGNITION IN HIGHLY NON-STATIONARY NOISE 109
5.3.1 SYSTEM DESCRIPTION 109
5.3.2 THE CHIME CHALLENGE 113
5.3.3 EXPERIMENTAL EVALUATION 114
5.3.4 CONCLUSIONS 122
5.4 RECOGNITION IN REVERBERANT ENVIRONMENTS 123
5.4.1 SYSTEM DESCRIPTION 123
5.4.2 THE REVERB CHALLENGE 126
5.4.3 EXPERIMENTAL EVALUATION 127
5.4.4 CONCLUSIONS 129
5.5 CHAPTER SUMMARY 130
6 SUMMARY 131
ACRONYMS 135
MATHEMATICAL SYMBOLS 137
LIST OF FIGURES 143
LIST OF TABLES 145
REFERENCES 147
|
any_adam_object | 1 |
author | Geiger, Jürgen Thomas |
author_facet | Geiger, Jürgen Thomas |
author_role | aut |
author_sort | Geiger, Jürgen Thomas |
author_variant | j t g jt jtg |
building | Verbundindex |
bvnumber | BV042391552 |
classification_tum | DAT 815d |
ctrlnum | (OCoLC)904452916 (DE-599)BVBBV042391552 |
discipline | Informatik |
edition | 1. Aufl. |
format | Thesis Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>02503nam a2200577 c 4500</leader><controlfield tag="001">BV042391552</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">00000000000000.0</controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">150304s2015 d||| m||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9783843919869</subfield><subfield code="9">978-3-8439-1986-9</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)904452916</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV042391552</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-91</subfield><subfield code="a">DE-12</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">DAT 815d</subfield><subfield code="2">stub</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Geiger, Jürgen Thomas</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Robust methods for content analysis of auditory scenes</subfield><subfield code="c">Jürgen Thomas Geiger</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">1. Aufl.</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">München</subfield><subfield code="b">Verl. Dr. Hut</subfield><subfield code="c">2015</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">IX, 172 S.</subfield><subfield code="b">graph. 
Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">Informationstechnik</subfield></datafield><datafield tag="502" ind1=" " ind2=" "><subfield code="a">Zugl.: München, Techn. Univ., Diss., 2014</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Automatische Identifikation</subfield><subfield code="0">(DE-588)4206098-9</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Robustheit</subfield><subfield code="0">(DE-588)4126481-2</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Geräuschanalyse</subfield><subfield code="0">(DE-588)4324422-1</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Automatische Inhaltsanalyse</subfield><subfield code="0">(DE-588)4265353-8</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Gesprochene Sprache</subfield><subfield code="0">(DE-588)4020717-1</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Automatische Sprechererkennung</subfield><subfield code="0">(DE-588)4143704-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield 
code="a">Störgeräusch</subfield><subfield code="0">(DE-588)4343358-3</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Nachhall</subfield><subfield code="0">(DE-588)4171018-6</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Automatische Spracherkennung</subfield><subfield code="0">(DE-588)4003961-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Gehen</subfield><subfield code="0">(DE-588)4140871-8</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="655" ind1=" " ind2="7"><subfield code="0">(DE-588)4113937-9</subfield><subfield code="a">Hochschulschrift</subfield><subfield code="2">gnd-content</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Geräuschanalyse</subfield><subfield code="0">(DE-588)4324422-1</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Gehen</subfield><subfield code="0">(DE-588)4140871-8</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Automatische Sprechererkennung</subfield><subfield code="0">(DE-588)4143704-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="3"><subfield code="a">Automatische Identifikation</subfield><subfield code="0">(DE-588)4206098-9</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="4"><subfield code="a">Automatische Spracherkennung</subfield><subfield code="0">(DE-588)4003961-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="5"><subfield code="a">Gesprochene Sprache</subfield><subfield 
code="0">(DE-588)4020717-1</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="6"><subfield code="a">Automatische Inhaltsanalyse</subfield><subfield code="0">(DE-588)4265353-8</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="7"><subfield code="a">Störgeräusch</subfield><subfield code="0">(DE-588)4343358-3</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="8"><subfield code="a">Nachhall</subfield><subfield code="0">(DE-588)4171018-6</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="9"><subfield code="a">Robustheit</subfield><subfield code="0">(DE-588)4126481-2</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">DNB Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027827424&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-027827424</subfield></datafield></record></collection> |
genre | (DE-588)4113937-9 Hochschulschrift gnd-content |
genre_facet | Hochschulschrift |
id | DE-604.BV042391552 |
illustrated | Illustrated |
indexdate | 2024-07-10T01:20:17Z |
institution | BVB |
isbn | 9783843919869 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-027827424 |
oclc_num | 904452916 |
open_access_boolean | |
owner | DE-91 DE-BY-TUM DE-12 |
owner_facet | DE-91 DE-BY-TUM DE-12 |
physical | IX, 172 S. graph. Darst. |
publishDate | 2015 |
publishDateSearch | 2015 |
publishDateSort | 2015 |
publisher | Verl. Dr. Hut |
record_format | marc |
series2 | Informationstechnik |
spelling | Geiger, Jürgen Thomas Verfasser aut Robust methods for content analysis of auditory scenes Jürgen Thomas Geiger 1. Aufl. München Verl. Dr. Hut 2015 IX, 172 S. graph. Darst. txt rdacontent n rdamedia nc rdacarrier Informationstechnik Zugl.: München, Techn. Univ., Diss., 2014 Automatische Identifikation (DE-588)4206098-9 gnd rswk-swf Robustheit (DE-588)4126481-2 gnd rswk-swf Geräuschanalyse (DE-588)4324422-1 gnd rswk-swf Automatische Inhaltsanalyse (DE-588)4265353-8 gnd rswk-swf Gesprochene Sprache (DE-588)4020717-1 gnd rswk-swf Automatische Sprechererkennung (DE-588)4143704-4 gnd rswk-swf Störgeräusch (DE-588)4343358-3 gnd rswk-swf Nachhall (DE-588)4171018-6 gnd rswk-swf Automatische Spracherkennung (DE-588)4003961-4 gnd rswk-swf Gehen (DE-588)4140871-8 gnd rswk-swf (DE-588)4113937-9 Hochschulschrift gnd-content Geräuschanalyse (DE-588)4324422-1 s Gehen (DE-588)4140871-8 s Automatische Sprechererkennung (DE-588)4143704-4 s Automatische Identifikation (DE-588)4206098-9 s Automatische Spracherkennung (DE-588)4003961-4 s Gesprochene Sprache (DE-588)4020717-1 s Automatische Inhaltsanalyse (DE-588)4265353-8 s Störgeräusch (DE-588)4343358-3 s Nachhall (DE-588)4171018-6 s Robustheit (DE-588)4126481-2 s DE-604 DNB Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027827424&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Geiger, Jürgen Thomas Robust methods for content analysis of auditory scenes Automatische Identifikation (DE-588)4206098-9 gnd Robustheit (DE-588)4126481-2 gnd Geräuschanalyse (DE-588)4324422-1 gnd Automatische Inhaltsanalyse (DE-588)4265353-8 gnd Gesprochene Sprache (DE-588)4020717-1 gnd Automatische Sprechererkennung (DE-588)4143704-4 gnd Störgeräusch (DE-588)4343358-3 gnd Nachhall (DE-588)4171018-6 gnd Automatische Spracherkennung (DE-588)4003961-4 gnd Gehen (DE-588)4140871-8 gnd |
subject_GND | (DE-588)4206098-9 (DE-588)4126481-2 (DE-588)4324422-1 (DE-588)4265353-8 (DE-588)4020717-1 (DE-588)4143704-4 (DE-588)4343358-3 (DE-588)4171018-6 (DE-588)4003961-4 (DE-588)4140871-8 (DE-588)4113937-9 |
title | Robust methods for content analysis of auditory scenes |
title_auth | Robust methods for content analysis of auditory scenes |
title_exact_search | Robust methods for content analysis of auditory scenes |
title_full | Robust methods for content analysis of auditory scenes Jürgen Thomas Geiger |
title_fullStr | Robust methods for content analysis of auditory scenes Jürgen Thomas Geiger |
title_full_unstemmed | Robust methods for content analysis of auditory scenes Jürgen Thomas Geiger |
title_short | Robust methods for content analysis of auditory scenes |
title_sort | robust methods for content analysis of auditory scenes |
topic | Automatische Identifikation (DE-588)4206098-9 gnd Robustheit (DE-588)4126481-2 gnd Geräuschanalyse (DE-588)4324422-1 gnd Automatische Inhaltsanalyse (DE-588)4265353-8 gnd Gesprochene Sprache (DE-588)4020717-1 gnd Automatische Sprechererkennung (DE-588)4143704-4 gnd Störgeräusch (DE-588)4343358-3 gnd Nachhall (DE-588)4171018-6 gnd Automatische Spracherkennung (DE-588)4003961-4 gnd Gehen (DE-588)4140871-8 gnd |
topic_facet | Automatische Identifikation Robustheit Geräuschanalyse Automatische Inhaltsanalyse Gesprochene Sprache Automatische Sprechererkennung Störgeräusch Nachhall Automatische Spracherkennung Gehen Hochschulschrift |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027827424&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT geigerjurgenthomas robustmethodsforcontentanalysisofauditoryscenes |