Generalized and efficient outlier detection for spatial, temporal, and high-dimensional data mining
Main author: | Schubert, Erich |
---|---|
Format: | Thesis, Book |
Language: | English |
Published: | 2013 |
Subjects: | |
Online access: | Full text (free of charge): http://d-nb.info/1048522377/34; table of contents |
Description: | XVIII, 262 pages; illustrations, diagrams, maps |
Format: | Long-term archiving guaranteed (LZA) |
Internal format
MARC
LEADER | 00000nam a2200000zc 4500 | ||
---|---|---|---|
001 | BV041802500 | ||
003 | DE-604 | ||
005 | 20140604 | ||
007 | t | ||
008 | 140415s2013 gw abd| m||| 00||| eng d | ||
015 | |a 14,O04 |2 dnb | ||
016 | 7 | |a 1048522377 |2 DE-101 | |
035 | |a (OCoLC)875596112 | ||
035 | |a (DE-599)DNB1048522377 | ||
040 | |a DE-604 |b ger |e rakddb | ||
041 | 0 | |a eng | |
044 | |a gw |c XA-DE-BY | ||
049 | |a DE-12 |a DE-384 |a DE-473 |a DE-703 |a DE-1051 |a DE-824 |a DE-29 |a DE-91 |a DE-19 |a DE-1049 |a DE-92 |a DE-739 |a DE-898 |a DE-355 |a DE-706 |a DE-20 |a DE-1102 | ||
082 | 0 | |a 370.949348 |2 22/ger | |
084 | |a 510 |2 sdnb | ||
100 | 1 | |a Schubert, Erich |e Verfasser |4 aut | |
245 | 1 | 0 | |a Generalized and efficient outlier detection for spatial, temporal, and high-dimensional data mining |c Erich Schubert |
264 | 1 | |c 2013 | |
300 | |a XVIII, 262 S. |b Ill., graph. Darst., Kt. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
502 | |a München, Univ., Diss., 2013 | ||
538 | |a Langzeitarchivierung gewährleistet, LZA | ||
655 | 7 | |0 (DE-588)4113937-9 |a Hochschulschrift |2 gnd-content | |
776 | 0 | 8 | |i Erscheint auch als |n Online-Ausgabe |o urn:nbn:de:bvb:19-166938 |
856 | 4 | 0 | |u https://nbn-resolving.org/urn:nbn:de:bvb:19-166938 |x Resolving-System |z kostenfrei |3 Volltext |
856 | 4 | 0 | |u http://d-nb.info/1048522377/34 |x Langzeitarchivierung Nationalbibliothek |
856 | 4 | 0 | |u http://edoc.ub.uni-muenchen.de/16693/ |x Verlag |z kostenfrei |
856 | 4 | 2 | |m DNB Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027248002&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
912 | |a ebook | ||
999 | |a oai:aleph.bib-bvb.de:BVB01-027248002 |
Record in the search index
_version_ | 1804152123415330816 |
---|---|
adam_text | CONTENTS OVERVIEW
ABSTRACT VII
PREFACE X
LIST OF FIGURES XIX
LIST OF TABLES XXII
LIST OF ALGORITHMS XXIV
NOMENCLATURE XXVI
1 INTRODUCTION 1
2 PRELIMINARIES 9
3 RELATED WORK 19
4 HIGH-DIMENSIONAL DATA AND THE CURSE OF DIMENSIONALITY 27
5 IMPROVING LOCAL OUTLIER DETECTION 57
6 GENERALIZATION AND MODULARIZATION 107
7 ENSEMBLE METHODS 149
8 SCALABILITY 173
9 CUSTOMIZATION CASE STUDIES 213
10 CONCLUSIONS AND OUTLOOK 225
BIBLIOGRAPHY 229
APPENDIX 253
CONTENTS
ABSTRACT VII
PREFACE X
LIST OF FIGURES XIX
LIST OF TABLES XXII
LIST OF ALGORITHMS XXIV
NOMENCLATURE XXVI
1 INTRODUCTION 1
1.1 OUTLIER DEFINITIONS 1
1.2 OUTLIERS IN GAUSSIAN DISTRIBUTIONS 4
1.3 WHISKER PLOTS 7
1.4 DEALING WITH OUTLIERS 8
2 PRELIMINARIES 9
2.1 AVERAGES AND GENERALIZED MEANS 9
2.1.1 PYTHAGOREAN MEANS 9
2.1.2 GENERALIZED MEANS 10
2.1.3 MEDIAN, MODE AND TRIMMED MEANS 11
2.1.4 CHOOSING THE APPROPRIATE MEAN 11
2.2 DISTANCE FUNCTIONS, METRICS AND NORMS 12
2.2.1 MINKOWSKI L_P-NORMS 14
2.2.2 DISTANCE FUNCTIONS INDUCED BY MEANS 15
2.2.3 WEIGHTED DISTANCE FUNCTIONS 15
2.3 KERNEL DENSITY ESTIMATION 16
2.3.1 POPULAR KERNEL FUNCTIONS 18
2.3.2 MULTIDIMENSIONAL KERNEL DENSITY ESTIMATION 18
3 RELATED WORK 19
3.1 DISTANCE-BASED OUTLIER DETECTION IN DATABASES 19
3.2 LOCAL OUTLIER DETECTION 21
3.3 BASIC LOF VARIATIONS 22
3.4 LOCAL OUTLIER DETECTION IN HIGH-DIMENSIONAL DATA 22
3.5 EFFICIENT APPROACHES FOR LOCAL OUTLIER DETECTION 24
3.6 ENSEMBLE METHODS FOR OUTLIER DETECTION 25
3.7 COMMUNITY OUTLIER DETECTION 25
3.8 GEOGRAPHIC OUTLIER DETECTION 26
4 HIGH-DIMENSIONAL DATA AND THE CURSE OF DIMENSIONALITY 27
4.1 MANIFESTATIONS OF THE CURSE OF DIMENSIONALITY 28
4.2 EMPIRICAL OBSERVATIONS ON HIGH-DIMENSIONAL DATA 30
4.3 INTRINSIC DIMENSIONALITY 34
4.4 SHARED NEAREST NEIGHBORS (SNN) 35
4.5 EMPIRICAL OBSERVATIONS ON SNN SIMILARITY 37
4.5.1 DATA SETS 38
4.5.2 DISTANCE MEASURES 41
4.5.3 EVALUATION CRITERIA 42
4.5.4 EXPERIMENTAL RESULTS 43
4.6 VISUALIZATION OF HIGH-DIMENSIONAL DATA 49
4.6.1 RELATED WORK 49
4.6.2 ARRANGING DIMENSIONS 51
4.6.3 VISUALIZATION EXAMPLES 54
5 IMPROVING LOCAL OUTLIER DETECTION 57
5.1 IMPROVING THE ROBUSTNESS OF THE LOCAL OUTLIER FACTOR 57
5.1.1 THE STABILIZATION EFFECT OF THE REACHABILITY DISTANCE 58
5.1.2 STABILIZATION WITH DIFFERENT AVERAGES 60
5.1.3 LOCAL OUTLIERS USING KERNEL DENSITY ESTIMATION 61
5.2 LOCAL OUTLIER DETECTION IN HIGHER DIMENSIONS 64
5.2.1 DETECTING LOCAL OUTLIERS IN AXIS PARALLEL SUBSPACES: SOD 64
5.2.2 DETECTING OUTLIERS IN ARBITRARILY ORIENTED SUBSPACES: COP 66
5.3 INTERPRETING AND UNIFYING OUTLIER SCORES 70
5.3.1 INTERPRETING OUTLIER SCORES 71
5.3.2 UNIFYING OUTLIER SCORES 73
5.4 EVALUATION OF OUTLIER SCORES 92
5.4.1 EVALUATION BY PRECISION@K AND ROC CURVES 92
5.4.2 CALIBRATION OF OUTLIER SCORES 95
5.4.3 EVALUATION BY DISTANCE MEASURES IN THE VECTOR SPACE OF SCORES 97
6 GENERALIZATION AND MODULARIZATION 107
6.1 DIFFERENT NOTIONS OF LOCALITY 108
6.2 FORMALIZED ANALYSIS OF OUTLIER DETECTION MODELS 113
6.2.1 GENERALIZED OUTLIER DETECTION MODEL FRAMEWORK 113
6.2.2 FUNDAMENTAL CONTEXT FUNCTIONS 115
6.2.3 FUNDAMENTAL MODEL FUNCTIONS 117
6.2.4 CASE STUDY: VARIANTS OF LOF 119
6.2.5 CASE STUDY: PLOT MODELS AS USED BY LOCI 121
6.2.6 DEPENDENCY GRAPH AND ORDER OF LOCALITY 123
6.3 LOCALITY AND SPATIAL OUTLIERS 126
6.3.1 DATA 127
6.3.2 NEIGHBORHOOD IN SPATIAL OUTLIER DETECTION 127
6.3.3 MODELS IN SPATIAL OUTLIER DETECTION 128
6.3.4 EXPERIMENTAL COMPARISON OF SPATIAL OUTLIER SCORES 131
6.3.5 SLOM AS A SPECIAL CASE OF LOCAL OUTLIER 133
6.3.6 UNIVARIATE VS. MULTIVARIATE OUTLIER ANALYSIS 135
6.4 LOCALITY IN VIDEO STREAMS 135
6.5 LOCALITY IN NETWORK OUTLIERS 137
7 ENSEMBLE METHODS 149
7.1 BACKGROUND 149
7.2 COMPONENTS OF AN ENSEMBLE METHOD 150
7.2.1 SCORE NORMALIZATION 151
7.2.2 SCORE COMBINATION 151
7.2.3 DIVERSITY SOURCES 153
7.2.4 ENSEMBLE CONSTRUCTION AND PRUNING 158
7.3 GREEDY ENSEMBLE CONSTRUCTION 159
7.4 EXPERIMENTS 161
7.4.1 COMBINATIONS OF DIFFERENT ALGORITHMS AND PARAMETERS 161
7.4.2 DIFFERENT ENSEMBLE COMBINATION RULES 166
7.4.3 SUBSAMPLING ENSEMBLES 168
8 SCALABILITY 173
8.1 RUNTIME COST OF OUTLIER DETECTION 174
8.2 APPROXIMATE NEAREST NEIGHBORS IN DENSE HIGH-DIMENSIONAL DATA 176
8.2.1 DIMENSIONALITY REDUCTION BY FEATURE SELECTION 176
8.2.2 DIMENSIONALITY REDUCTION BY RANDOM PROJECTIONS 177
8.2.3 DIMENSIONALITY REDUCTION BY SPACE-FILLING CURVES 179
8.2.4 FAST APPROXIMATE KNN SEARCH WITH SPACE-FILLING CURVES 183
8.2.5 DIFFERENT KINDS OF APPROXIMATIONS 185
8.2.6 EVALUATION 186
8.3 INDEXING GEODETIC DATA 190
8.3.1 INTRODUCTION 190
8.3.2 RELATED WORK 194
8.3.3 INDEXING GEODETIC DATA 197
8.3.4 EXPERIMENTS 203
8.3.5 CONCLUSIONS 208
9 CUSTOMIZATION CASE STUDIES 213
9.1 CASE STUDY: KERNEL DENSITY ESTIMATION OUTLIER DETECTION (KDEOS) 213
9.1.1 DENSITY ESTIMATION STEP 214
9.1.2 DENSITY COMPARISON STEP 215
9.1.3 SCORE NORMALIZATION STEP 215
9.1.4 ALGORITHM AND COMPLEXITY 216
9.1.5 EXPERIMENTAL RESULTS 217
9.2 CUSTOMIZATION CASE STUDY: ROAD ACCIDENTS BLACKSPOTS 218
9.2.1 EXPERIMENTAL SETUP 218
9.2.2 RESULTS 219
9.3 CUSTOMIZATION CASE STUDY: RADIATION MEASUREMENTS 222
9.3.1 EXPERIMENTAL SETUP 222
9.3.2 RESULTS 223
10 CONCLUSIONS AND OUTLOOK 225
BIBLIOGRAPHY 229
APPENDIX 253
1 WEIGHTED PEARSON CORRELATION COEFFICIENT 253
1.1 WEIGHTED COVARIANCE 254
1.2 NUMERICAL INSTABILITIES 254
1.3 A NUMERICALLY STABLE ON-LINE ALGORITHM 255
2 OVERVIEW OF PRIOR PUBLISHED PARTS 256
|
any_adam_object | 1 |
author | Schubert, Erich |
author_facet | Schubert, Erich |
author_role | aut |
author_sort | Schubert, Erich |
author_variant | e s es |
building | Verbundindex |
bvnumber | BV041802500 |
collection | ebook |
ctrlnum | (OCoLC)875596112 (DE-599)DNB1048522377 |
dewey-full | 370.949348 |
dewey-hundreds | 300 - Social sciences |
dewey-ones | 370 - Education |
dewey-raw | 370.949348 |
dewey-search | 370.949348 |
dewey-sort | 3370.949348 |
dewey-tens | 370 - Education |
discipline | Pädagogik Mathematik |
format | Thesis Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01774nam a2200409zc 4500</leader><controlfield tag="001">BV041802500</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20140604 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">140415s2013 gw abd| m||| 00||| eng d</controlfield><datafield tag="015" ind1=" " ind2=" "><subfield code="a">14,O04</subfield><subfield code="2">dnb</subfield></datafield><datafield tag="016" ind1="7" ind2=" "><subfield code="a">1048522377</subfield><subfield code="2">DE-101</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)875596112</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)DNB1048522377</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakddb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="044" ind1=" " ind2=" "><subfield code="a">gw</subfield><subfield code="c">XA-DE-BY</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-12</subfield><subfield code="a">DE-384</subfield><subfield code="a">DE-473</subfield><subfield code="a">DE-703</subfield><subfield code="a">DE-1051</subfield><subfield code="a">DE-824</subfield><subfield code="a">DE-29</subfield><subfield code="a">DE-91</subfield><subfield code="a">DE-19</subfield><subfield code="a">DE-1049</subfield><subfield code="a">DE-92</subfield><subfield code="a">DE-739</subfield><subfield code="a">DE-898</subfield><subfield code="a">DE-355</subfield><subfield code="a">DE-706</subfield><subfield code="a">DE-20</subfield><subfield code="a">DE-1102</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">370.949348</subfield><subfield 
code="2">22/ger</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">510</subfield><subfield code="2">sdnb</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Schubert, Erich</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Generalized and efficient outlier detection for spatial, temporal, and high-dimensional data mining</subfield><subfield code="c">Erich Schubert</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2013</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XVIII, 262 S.</subfield><subfield code="b">Ill., graph. Darst., Kt.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="502" ind1=" " ind2=" "><subfield code="a">München, Univ., Diss., 2013</subfield></datafield><datafield tag="538" ind1=" " ind2=" "><subfield code="a">Langzeitarchivierung gewährleistet, LZA</subfield></datafield><datafield tag="655" ind1=" " ind2="7"><subfield code="0">(DE-588)4113937-9</subfield><subfield code="a">Hochschulschrift</subfield><subfield code="2">gnd-content</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Online-Ausgabe</subfield><subfield code="o">urn:nbn:de:bvb:19-166938</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="u">https://nbn-resolving.org/urn:nbn:de:bvb:19-166938</subfield><subfield code="x">Resolving-System</subfield><subfield code="z">kostenfrei</subfield><subfield 
code="3">Volltext</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="u">http://d-nb.info/1048522377/34</subfield><subfield code="x">Langzeitarchivierung Nationalbibliothek</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="u">http://edoc.ub.uni-muenchen.de/16693/</subfield><subfield code="x">Verlag</subfield><subfield code="z">kostenfrei</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">DNB Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&amp;doc_library=BVB01&amp;local_base=BVB01&amp;doc_number=027248002&amp;sequence=000001&amp;line_number=0001&amp;func_code=DB_RECORDS&amp;service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ebook</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-027248002</subfield></datafield></record></collection> |
genre | (DE-588)4113937-9 Hochschulschrift gnd-content |
genre_facet | Hochschulschrift |
id | DE-604.BV041802500 |
illustrated | Illustrated |
indexdate | 2024-07-10T01:05:42Z |
institution | BVB |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-027248002 |
oclc_num | 875596112 |
open_access_boolean | 1 |
owner | DE-12 DE-384 DE-473 DE-BY-UBG DE-703 DE-1051 DE-824 DE-29 DE-91 DE-BY-TUM DE-19 DE-BY-UBM DE-1049 DE-92 DE-739 DE-898 DE-BY-UBR DE-355 DE-BY-UBR DE-706 DE-20 DE-1102 |
owner_facet | DE-12 DE-384 DE-473 DE-BY-UBG DE-703 DE-1051 DE-824 DE-29 DE-91 DE-BY-TUM DE-19 DE-BY-UBM DE-1049 DE-92 DE-739 DE-898 DE-BY-UBR DE-355 DE-BY-UBR DE-706 DE-20 DE-1102 |
physical | XVIII, 262 S. Ill., graph. Darst., Kt. |
psigel | ebook |
publishDate | 2013 |
publishDateSearch | 2013 |
publishDateSort | 2013 |
record_format | marc |
spelling | Schubert, Erich Verfasser aut Generalized and efficient outlier detection for spatial, temporal, and high-dimensional data mining Erich Schubert 2013 XVIII, 262 S. Ill., graph. Darst., Kt. txt rdacontent n rdamedia nc rdacarrier München, Univ., Diss., 2013 Langzeitarchivierung gewährleistet, LZA (DE-588)4113937-9 Hochschulschrift gnd-content Erscheint auch als Online-Ausgabe urn:nbn:de:bvb:19-166938 https://nbn-resolving.org/urn:nbn:de:bvb:19-166938 Resolving-System kostenfrei Volltext http://d-nb.info/1048522377/34 Langzeitarchivierung Nationalbibliothek http://edoc.ub.uni-muenchen.de/16693/ Verlag kostenfrei DNB Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027248002&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Schubert, Erich Generalized and efficient outlier detection for spatial, temporal, and high-dimensional data mining |
subject_GND | (DE-588)4113937-9 |
title | Generalized and efficient outlier detection for spatial, temporal, and high-dimensional data mining |
title_auth | Generalized and efficient outlier detection for spatial, temporal, and high-dimensional data mining |
title_exact_search | Generalized and efficient outlier detection for spatial, temporal, and high-dimensional data mining |
title_full | Generalized and efficient outlier detection for spatial, temporal, and high-dimensional data mining Erich Schubert |
title_fullStr | Generalized and efficient outlier detection for spatial, temporal, and high-dimensional data mining Erich Schubert |
title_full_unstemmed | Generalized and efficient outlier detection for spatial, temporal, and high-dimensional data mining Erich Schubert |
title_short | Generalized and efficient outlier detection for spatial, temporal, and high-dimensional data mining |
title_sort | generalized and efficient outlier detection for spatial temporal and high dimensional data mining |
topic_facet | Hochschulschrift |
url | https://nbn-resolving.org/urn:nbn:de:bvb:19-166938 http://d-nb.info/1048522377/34 http://edoc.ub.uni-muenchen.de/16693/ http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027248002&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT schuberterich generalizedandefficientoutlierdetectionforspatialtemporalandhighdimensionaldatamining |