Applied data mining
Main authors: | Xu, Guandong; Zong, Yu; Yang, Zhenglu |
---|---|
Format: | Book |
Language: | English |
Published: | Boca Raton [etc.], CRC Press, 2013 |
Subjects: | Datenaufbereitung; Big Data; Datenmanagement; Datenanalyse |
Online access: | Table of contents |
Description: | XI, 272 pages, illustrations |
ISBN: | 9781466585836 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV041159152 | ||
003 | DE-604 | ||
005 | 20131015 | ||
007 | t | ||
008 | 130722s2013 d||| |||| 00||| eng d | ||
020 | |a 9781466585836 |9 978-1-4665-8583-6 | ||
035 | |a (OCoLC)856835362 | ||
035 | |a (DE-599)BSZ380980215 | ||
040 | |a DE-604 |b ger | ||
041 | 0 | |a eng | |
049 | |a DE-83 |a DE-473 |a DE-2070s |a DE-703 |a DE-824 | ||
084 | |a SK 850 |0 (DE-625)143263: |2 rvk | ||
084 | |a ST 530 |0 (DE-625)143679: |2 rvk | ||
100 | 1 | |a Xu, Guandong |e Verfasser |0 (DE-588)142907936 |4 aut | |
245 | 1 | 0 | |a Applied data mining |c Guandong Xu ; Yu Zong ; Zhenglu Yang |
264 | 1 | |a Boca Raton [u.a.] |b CRC Press |c 2013 | |
300 | |a XI, 272 S. |b graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
650 | 0 | 7 | |a Datenaufbereitung |0 (DE-588)4148865-9 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Big Data |0 (DE-588)4802620-7 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Datenmanagement |0 (DE-588)4213132-7 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Datenanalyse |0 (DE-588)4123037-1 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Big Data |0 (DE-588)4802620-7 |D s |
689 | 0 | 1 | |a Datenmanagement |0 (DE-588)4213132-7 |D s |
689 | 0 | 2 | |a Datenanalyse |0 (DE-588)4123037-1 |D s |
689 | 0 | 3 | |a Datenaufbereitung |0 (DE-588)4148865-9 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Zong, Yu |e Verfasser |0 (DE-588)1038286484 |4 aut | |
700 | 1 | |a Yang, Zhenglu |e Verfasser |0 (DE-588)1038286549 |4 aut | |
856 | 4 | 2 | |m Digitalisierung UB Bamberg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026134447&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-026134447 |
Record in search index
_version_ | 1804150564833984512 |
---|---|
adam_text | Contents
Preface v
Part I: Fundamentals
1. Introduction 3
1.1 Background 3
1.1.1 Data Mining: Definitions and Concepts 4
1.1.2 Data Mining Process 6
1.1.3 Data Mining Algorithms 10
1.2 Organization of the Book 16
1.2.1 Part 1: Fundamentals 17
1.2.2 Part 2: Advanced Data Mining 18
1.2.3 Part 3: Emerging Applications 19
1.3 The Audience of the Book 19
2. Mathematical Foundations 21
2.1 Organization of Data 21
2.1.1 Boolean Model 22
2.1.2 Vector Space Model 22
2.1.3 Graph Model 23
2.1.4 Other Data Structures 26
2.2 Data Distribution 27
2.2.1 Univariate Distribution 27
2.2.2 Multivariate Distribution 28
2.3 Distance Measures 29
2.3.1 Jaccard distance 30
2.3.2 Euclidean Distance 30
2.3.3 Minkowski Distance 31
2.3.4 Chebyshev Distance 32
2.3.5 Mahalanobis Distance 32
2.4 Similarity Measures 33
2.4.1 Cosine Similarity 33
2.4.2 Adjusted Cosine Similarity 34
2.4.3 Kullback-Leibler Divergence 35
2.4.4 Model-based Measures 37
2.5 Dimensionality Reduction 38
2.5.1 Principal Component Analysis 38
2.5.2 Independent Component Analysis 40
2.5.3 Non-negative Matrix Factorization 41
2.5.4 Singular Value Decomposition 42
2.6 Chapter Summary 43
3. Data Preparation 45
3.1 Attribute Selection 46
3.1.1 Feature Selection 46
3.1.2 Discretizing Numeric Attributes 49
3.2 Data Cleaning and Integrity 50
3.2.1 Missing Values 50
3.2.2 Detecting Anomalies 51
3.2.3 Applications 52
3.3 Multiple Model Integration 53
3.3.1 Data Federation 53
3.3.2 Bagging and Boosting 54
3.4 Chapter Summary 55
4. Clustering Analysis 57
4.1 Clustering Analysis 57
4.2 Types of Data in Clustering Analysis 59
4.2.1 Data Matrix 59
4.2.2 The Proximity Matrix 61
4.3 Traditional Clustering Algorithms 63
4.3.1 Partitional methods 63
4.3.2 Hierarchical Methods 68
4.3.3 Density-based methods 74
4.3.4 Grid-based Methods 77
4.3.5 Model-based Methods 80
4.4 High-dimensional clustering algorithm 83
4.4.1 Bottom-up Approaches 84
4.4.2 Top-down Approaches 86
4.4.3 Other Methods 88
4.5 Constraint-based Clustering Algorithm 89
4.5.1 COP K-means 90
4.5.2 MPCK-means 90
4.5.3 AFCC 91
4.6 Consensus Clustering Algorithm 92
4.6.1 Consensus Clustering Framework 93
4.6.2 Some Consensus Clustering Methods 95
4.7 Chapter Summary 96
5. Classification 100
5.1 Classification Definition and Related Issues 101
5.2 Decision Tree and Classification 103
5.2.1 Decision Tree 103
5.2.2 Decision Tree Classification 105
5.2.3 Hunt's Algorithm 106
5.3 Bayesian Network and Classification 107
5.3.1 Bayesian Network 107
5.3.2 Backpropagation and Classification 109
5.3.3 Association-based Classification 110
5.3.4 Support Vector Machines and Classification 112
5.4 Chapter Summary 115
6. Frequent Pattern Mining 117
6.1 Association Rule Mining 117
6.1.1 Association Rule Mining Problem 118
6.1.2 Basic Algorithms for Association Rule Mining 120
6.2 Sequential Pattern Mining 124
6.2.1 Sequential Pattern Mining Problem 125
6.2.2 Existing Sequential Pattern Mining Algorithms 126
6.3 Frequent Subtree Mining 137
6.3.1 Frequent Subtree Mining Problem 137
6.3.2 Data Structures for Storing Trees 138
6.3.3 Maximal and closed frequent subtrees 141
6.4 Frequent Subgraph Mining 142
6.4.1 Problem Definition 142
6.4.2 Graph Representation 143
6.4.3 Candidate Generation 144
6.4.4 Frequent Subgraph Mining Algorithms 145
6.5 Chapter Summary 146
Part II: Advanced Data Mining
7. Advanced Clustering Analysis 153
7.1 Introduction 153
7.2 Space Smoothing Search Methods in Heuristic Clustering 155
7.2.1 Smoothing Search Space and Smoothing Operator 156
7.2.2 Clustering Algorithm based on Smoothed Search Space 161
7.3 Using Approximate Backbone for Initializations in Clustering 163
7.3.1 Definitions and Background of Approximate Backbone 164
7.3.2 Heuristic Clustering Algorithm based on Approximate Backbone 167
7.4 Improving Clustering Quality in High Dimensional Space 169
7.4.1 Overview of High Dimensional Clustering 169
7.4.2 Motivation of our Method 171
7.4.3 Significant Local Dense Area 171
7.4.4 Projective Clustering based on SLDAs 175
7.5 Chapter Summary 178
8. Multi-Label Classification 181
8.1 Introduction 181
8.2 What is Multi-label Classification 182
8.3 Problem Transformation 184
8.3.1 Binary Relevance and Label Powerset 185
8.3.2 Classifier Chains and Probabilistic Classifier Chains 187
8.3.3 Decompose the Label Set 189
8.3.4 Transform Original Label Space to Another Space 191
8.4 Algorithm Adaptation 192
8.4.1 KNN-based methods 192
8.4.2 Learn the Label Dependencies by the Statistical Models 194
8.5 Evaluation Metrics and Datasets 195
8.5.1 Evaluation Metrics 195
8.5.2 Benchmark Datasets and the Statistics 199
8.6 Chapter Summary 200
9. Privacy Preserving in Data Mining 204
9.1 The K-Anonymity Method 204
9.2 The l-Diversity Method 208
9.3 The t-Closeness Method 210
9.4 Discussion and Challenges 211
9.5 Chapter Summary 211
Part III: Emerging Applications
10. Data Stream 215
10.1 General Data Stream Models 215
10.2 Sampling Approach 216
10.2.1 Random Sampling 218
10.2.2 Cluster Sampling 219
10.3 Wavelet Method 220
10.4 Sketch Method 222
10.4.1 Sliding Window-based Sketch 223
10.4.2 Count Sketch 224
10.4.3 Fast Count Sketch 225
10.4.4 Count Min Sketch 225
10.4.5 Some Related Issues on Sketches 226
10.4.6 Applications of Sketches 227
10.4.7 Advantages and Limitations of Sketch Strategies 227
10.5 Histogram Method 228
10.5.1 Dynamic Construction of Histograms 230
10.6 Discussion 231
10.7 Chapter Summary 232
11. Recommendation Systems 236
11.1 Collaborative Filtering 236
11.1.1 Memory-based Collaborative Recommendation 237
11.1.2 Model-based Recommendation 238
11.2 PLSA Method 238
11.2.1 User Pattern Extraction and Latent Factor Recognition 240
11.3 Tensor Model 242
11.4 Discussion and Challenges 244
11.4.1 Security and Privacy Issues 244
11.4.2 Effectiveness Issue 245
11.5 Chapter Summary 246
12. Social Tagging Systems 248
12.1 Data Mining and Information Retrieval 248
12.2 Recommender Systems 250
12.2.1 Recommendation Algorithms 251
12.2.2 Tag-Based Recommender Systems 254
12.3 Clustering Algorithms in Recommendation 257
12.3.1 K-means Algorithm 257
12.3.2 Hierarchical Clustering 259
12.3.3 Spectral Clustering 260
12.3.4 Quality of Clusters and Modularity Method 261
12.3.5 K-Nearest-Neighboring 263
12.4 Clustering Algorithms in Tag-Based Recommender Systems 264
12.5 Chapter Summary 266
Index 271
|
any_adam_object | 1 |
author | Xu, Guandong Zong, Yu Yang, Zhenglu |
author_GND | (DE-588)142907936 (DE-588)1038286484 (DE-588)1038286549 |
author_facet | Xu, Guandong Zong, Yu Yang, Zhenglu |
author_role | aut aut aut |
author_sort | Xu, Guandong |
author_variant | g x gx y z yz z y zy |
building | Verbundindex |
bvnumber | BV041159152 |
classification_rvk | SK 850 ST 530 |
ctrlnum | (OCoLC)856835362 (DE-599)BSZ380980215 |
discipline | Informatik Mathematik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01748nam a2200421 c 4500</leader><controlfield tag="001">BV041159152</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20131015 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">130722s2013 d||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781466585836</subfield><subfield code="9">978-1-4665-8583-6</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)856835362</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BSZ380980215</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-83</subfield><subfield code="a">DE-473</subfield><subfield code="a">DE-2070s</subfield><subfield code="a">DE-703</subfield><subfield code="a">DE-824</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">SK 850</subfield><subfield code="0">(DE-625)143263:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 530</subfield><subfield code="0">(DE-625)143679:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Xu, Guandong</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)142907936</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Applied data mining</subfield><subfield code="c">Guandong Xu ; Yu Zong ; Zhenglu Yang</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Boca Raton [u.a.]</subfield><subfield code="b">CRC 
Press</subfield><subfield code="c">2013</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XI, 272 S.</subfield><subfield code="b">graph. Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Datenaufbereitung</subfield><subfield code="0">(DE-588)4148865-9</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Big Data</subfield><subfield code="0">(DE-588)4802620-7</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Datenmanagement</subfield><subfield code="0">(DE-588)4213132-7</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Datenanalyse</subfield><subfield code="0">(DE-588)4123037-1</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Big Data</subfield><subfield code="0">(DE-588)4802620-7</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Datenmanagement</subfield><subfield code="0">(DE-588)4213132-7</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Datenanalyse</subfield><subfield code="0">(DE-588)4123037-1</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="3"><subfield 
code="a">Datenaufbereitung</subfield><subfield code="0">(DE-588)4148865-9</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Zong, Yu</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1038286484</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Yang, Zhenglu</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1038286549</subfield><subfield code="4">aut</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Bamberg - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026134447&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-026134447</subfield></datafield></record></collection> |
id | DE-604.BV041159152 |
illustrated | Illustrated |
indexdate | 2024-07-10T00:40:56Z |
institution | BVB |
isbn | 9781466585836 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-026134447 |
oclc_num | 856835362 |
open_access_boolean | |
owner | DE-83 DE-473 DE-BY-UBG DE-2070s DE-703 DE-824 |
owner_facet | DE-83 DE-473 DE-BY-UBG DE-2070s DE-703 DE-824 |
physical | XI, 272 S. graph. Darst. |
publishDate | 2013 |
publishDateSearch | 2013 |
publishDateSort | 2013 |
publisher | CRC Press |
record_format | marc |
spelling | Xu, Guandong Verfasser (DE-588)142907936 aut Applied data mining Guandong Xu ; Yu Zong ; Zhenglu Yang Boca Raton [u.a.] CRC Press 2013 XI, 272 S. graph. Darst. txt rdacontent n rdamedia nc rdacarrier Datenaufbereitung (DE-588)4148865-9 gnd rswk-swf Big Data (DE-588)4802620-7 gnd rswk-swf Datenmanagement (DE-588)4213132-7 gnd rswk-swf Datenanalyse (DE-588)4123037-1 gnd rswk-swf Big Data (DE-588)4802620-7 s Datenmanagement (DE-588)4213132-7 s Datenanalyse (DE-588)4123037-1 s Datenaufbereitung (DE-588)4148865-9 s DE-604 Zong, Yu Verfasser (DE-588)1038286484 aut Yang, Zhenglu Verfasser (DE-588)1038286549 aut Digitalisierung UB Bamberg - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026134447&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Xu, Guandong Zong, Yu Yang, Zhenglu Applied data mining Datenaufbereitung (DE-588)4148865-9 gnd Big Data (DE-588)4802620-7 gnd Datenmanagement (DE-588)4213132-7 gnd Datenanalyse (DE-588)4123037-1 gnd |
subject_GND | (DE-588)4148865-9 (DE-588)4802620-7 (DE-588)4213132-7 (DE-588)4123037-1 |
title | Applied data mining |
title_auth | Applied data mining |
title_exact_search | Applied data mining |
title_full | Applied data mining Guandong Xu ; Yu Zong ; Zhenglu Yang |
title_fullStr | Applied data mining Guandong Xu ; Yu Zong ; Zhenglu Yang |
title_full_unstemmed | Applied data mining Guandong Xu ; Yu Zong ; Zhenglu Yang |
title_short | Applied data mining |
title_sort | applied data mining |
topic | Datenaufbereitung (DE-588)4148865-9 gnd Big Data (DE-588)4802620-7 gnd Datenmanagement (DE-588)4213132-7 gnd Datenanalyse (DE-588)4123037-1 gnd |
topic_facet | Datenaufbereitung Big Data Datenmanagement Datenanalyse |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026134447&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT xuguandong applieddatamining AT zongyu applieddatamining AT yangzhenglu applieddatamining |