Principles of data mining

Saved in:

Main author: Bramer, Max A. 1948-
Format: Book
Language: English
Published: London: Springer, [2020]
Edition: Fourth edition
Series: Undergraduate topics in computer science
Subjects: Data Mining
Online access: Table of contents
Description: Bibliographical references
Description: xvi, 571 pages, illustrations, diagrams
ISBN: 9781447174929

Internal format

MARC
LEADER 00000nam a2200000 c 4500
001    BV046913134
003    DE-604
005    20220121
007    t
008    200925s2020 xxka||| |||| 00||| eng d
020    |a 9781447174929 |c (pbk) |9 978-1-4471-7492-9
035    |a (OCoLC)1184757055
035    |a (DE-599)KXP172606784X
040    |a DE-604 |b ger |e rda
041 0  |a eng
044    |a xxk |c XA-GB
049    |a DE-83 |a DE-11 |a DE-355
082 0  |a 025.04
084    |a ST 530 |0 (DE-625)143679: |2 rvk
084    |a QH 500 |0 (DE-625)141607: |2 rvk
084    |a 54.72 |2 bkl
084    |a 54.62 |2 bkl
084    |a 54.64 |2 bkl
100 1  |a Bramer, Max A. |d 1948- |e Verfasser |0 (DE-588)121430855 |4 aut
245 10 |a Principles of data mining |c Max Bramer
250    |a Fourth edition
264  1 |a London |b Springer |c [2020]
300    |a xvi, 571 Seiten |b Illustrationen, Diagramme
336    |b txt |2 rdacontent
337    |b n |2 rdamedia
338    |b nc |2 rdacarrier
490 0  |a Undergraduate topics in computer science
500    |a Literaturangaben
650 07 |a Data Mining |0 (DE-588)4428654-5 |2 gnd |9 rswk-swf
653  0 |a Data mining
655  7 |0 (DE-588)4123623-3 |a Lehrbuch |2 gnd-content
689 00 |a Data Mining |0 (DE-588)4428654-5 |D s
689 0  |5 DE-604
776 08 |i Erscheint auch als |n Online-Ausgabe |z 978-1-4471-7493-6
856 42 |m HEBIS Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032322538&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis
943 1  |a oai:aleph.bib-bvb.de:BVB01-032322538
Record in the search index

_version_ | 1805088178060656640 |
adam_text |
Max Bramer
Principles of Data Mining
Fourth Edition
Springer
Contents
1 Introduction to Data Mining ..... 1
1.1 The Data Explosion ..... 1
1.2 Knowledge Discovery ..... 2
1.3 Applications of Data Mining ..... 3
1.4 Labelled and Unlabelled Data ..... 4
1.5 Supervised Learning: Classification ..... 5
1.6 Supervised Learning: Numerical Prediction ..... 7
1.7 Unsupervised Learning: Association Rules ..... 7
1.8 Unsupervised Learning: Clustering ..... 8
2 Data for Data Mining ..... 9
2.1 Standard Formulation ..... 9
2.2 Types of Variable ..... 10
2.2.1 Categorical and Continuous Attributes ..... 12
2.3 Data Preparation ..... 12
2.3.1 Data Cleaning ..... 13
2.4 Missing Values ..... 15
2.4.1 Discard Instances ..... 15
2.4.2 Replace by Most Frequent/Average Value ..... 15
2.5 Reducing the Number of Attributes ..... 16
2.6 The UCI Repository of Datasets ..... 17
2.7 Chapter Summary ..... 18
2.8 Self-assessment Exercises for Chapter 2 ..... 18
Reference ..... 19
3 Introduction to Classification: Naive Bayes and Nearest Neighbour
3.1 What Is Classification?
3.2 Naive Bayes Classifiers
3.3 Nearest Neighbour Classification
3.3.1 Distance Measures
3.3.2 Normalisation ..... 35
3.3.3 Dealing with Categorical Attributes ..... 36
3.4 Eager and Lazy Learning ..... 36
3.5 Chapter Summary ..... 37
3.6 Self-assessment Exercises for Chapter 3 ..... 37
4 Using Decision Trees for Classification ..... 39
4.1 Decision Rules and Decision Trees ..... 39
4.1.1 Decision Trees: The Golf Example ..... 40
4.1.2 Terminology ..... 41
4.1.3 The degrees Dataset ..... 42
4.2 The TDIDT Algorithm ..... 45
4.3 Types of Reasoning ..... 47
4.4 Chapter Summary ..... 48
4.5 Self-assessment Exercises for Chapter 4 ..... 48
References ..... 48
5 Decision Tree Induction: Using Entropy for Attribute Selection ..... 49
5.1 Attribute Selection: An Experiment ..... 49
5.2 Alternative Decision Trees ..... 50
5.2.1 The Football/Netball Example ..... 51
5.2.2 The anonymous Dataset ..... 53
5.3 Choosing Attributes to Split On: Using Entropy ..... 54
5.3.1 The lens24 Dataset
5.3.2 Entropy ..... 57
5.3.3 Using Entropy for Attribute Selection ..... 58
5.3.4 Maximising Information Gain ..... 60
5.4 Chapter Summary ..... 61
5.5 Self-assessment Exercises for Chapter 5 ..... 61
6 Decision Tree Induction: Using Frequency Tables for Attribute Selection ..... 63
6.1 Calculating Entropy in Practice ..... 63
6.1.1 Proof of Equivalence ..... 64
6.1.2 A Note on Zeros ..... 66
6.2 Other Attribute Selection Criteria: Gini Index of Diversity ..... 66
6.3 The χ² Attribute Selection Criterion ..... 68
6.4 Inductive Bias ..... 71
6.5 Using Gain Ratio for Attribute Selection ..... 73
6.5.1 Properties of Split Information ..... 74
6.5.2 Summary ..... 75
6.6 Number of Rules Generated by Different Attribute Selection Criteria ..... 75
6.7 Missing Branches ..... 76
6.8 Chapter Summary ..... 77
6.9 Self-assessment Exercises for Chapter 6 ..... 77
References ..... 78
7 Estimating the Predictive Accuracy of a Classifier ..... 79
7.1 Introduction ..... 79
7.2 Method 1: Separate Training and Test Sets ..... 80
7.2.1 Standard Error ..... 81
7.2.2 Repeated Train and Test ..... 82
7.3 Method 2: k-fold Cross-validation ..... 82
7.4 Method 3: N-fold Cross-validation ..... 83
7.5 Experimental Results I ..... 84
7.6 Experimental Results II: Datasets with Missing Values ..... 86
7.6.1 Strategy 1: Discard Instances ..... 87
7.6.2 Strategy 2: Replace by Most Frequent/Average Value ..... 87
7.6.3 Missing Classifications ..... 89
7.7 Confusion Matrix ..... 89
7.7.1 True and False Positives ..... 90
7.8 Chapter Summary ..... 91
7.9 Self-assessment Exercises for Chapter 7 ..... 91
Reference ..... 92
8 Continuous Attributes ..... 93
8.1 Introduction ..... 93
8.2 Local versus Global Discretisation ..... 95
8.3 Adding Local Discretisation to TDIDT ..... 96
8.3.1 Calculating the Information Gain of a Set of Pseudo-attributes ..... 97
8.3.2 Computational Efficiency ..... 102
8.4 Using the ChiMerge Algorithm for Global Discretisation ..... 105
8.4.1 Calculating the Expected Values and χ² ..... 108
8.4.2 Finding the Threshold Value ..... 113
8.4.3 Setting minIntervals and maxIntervals ..... 113
8.4.4 The ChiMerge Algorithm: Summary ..... 115
8.4.5 The ChiMerge Algorithm: Comments ..... 115
8.5 Comparing Global and Local Discretisation for Tree Induction ..... 116
8.6 Chapter Summary ..... 118
8.7 Self-assessment Exercises for Chapter 8 ..... 118
Reference ..... 119
9 Avoiding Overfitting of Decision Trees ..... 121
9.1 Dealing with Clashes in a Training Set ..... 122
9.1.1 Adapting TDIDT to Deal with Clashes ..... 122
9.2 More About Overfitting Rules to Data ..... 127
9.3 Pre-pruning Decision Trees ..... 128
9.4 Post-pruning Decision Trees ..... 130
9.5 Chapter Summary ..... 135
9.6 Self-assessment Exercise for Chapter 9 ..... 136
References ..... 136
10 More About Entropy ..... 137
10.1 Introduction ..... 137
10.2 Coding Information Using Bits ..... 140
10.3 Discriminating Amongst M Values (M Not a Power of 2) ..... 142
10.4 Encoding Values That Are Not Equally Likely ..... 143
10.5 Entropy of a Training Set ..... 146
10.6 Information Gain Must Be Positive or Zero ..... 147
10.7 Using Information Gain for Feature Reduction for Classification Tasks ..... 149
10.7.1 Example 1: The genetics Dataset ..... 150
10.7.2 Example 2: The bcst96 Dataset ..... 154
10.8 Chapter Summary ..... 156
10.9 Self-assessment Exercises for Chapter 10 ..... 156
References ..... 156
11 Inducing Modular Rules for Classification ..... 157
11.1 Rule Post-pruning ..... 157
11.2 Conflict Resolution ..... 159
11.3 Problems with Decision Trees ..... 162
11.4 The Prism Algorithm ..... 164
11.4.1 Changes to the Basic Prism Algorithm ..... 171
11.4.2 Comparing Prism with TDIDT ..... 172
11.5 Chapter Summary ..... 173
11.6 Self-assessment Exercise for Chapter 11 ..... 173
References ..... 174
12 Measuring the Performance of a Classifier ..... 175
12.1 True and False Positives and Negatives ..... 176
12.2 Performance Measures ..... 178
12.3 True and False Positive Rates versus Predictive Accuracy ..... 181
12.4 ROC Graphs ..... 182
12.5 ROC Curves ..... 184
12.6 Finding the Best Classifier ..... 185
12.7 Chapter Summary ..... 186
12.8 Self-assessment Exercise for Chapter 12 ..... 187
13 Dealing with Large Volumes of Data ..... 189
13.1 Introduction ..... 189
13.2 Distributing Data onto Multiple Processors ..... 192
13.3 Case Study: PMCRI ..... 194
13.4 Evaluating the Effectiveness of a Distributed System: PMCRI ..... 197
13.5 Revising a Classifier Incrementally ..... 201
13.6 Chapter Summary ..... 207
13.7 Self-assessment Exercises for Chapter 13 ..... 207
References ..... 208
14 Ensemble Classification ..... 209
14.1 Introduction ..... 209
14.2 Estimating the Performance of a Classifier ..... 212
14.3 Selecting a Different Training Set for Each Classifier ..... 213
14.4 Selecting a Different Set of Attributes for Each Classifier ..... 214
14.5 Combining Classifications: Alternative Voting Systems ..... 215
14.6 Parallel Ensemble Classifiers ..... 219
14.7 Chapter Summary ..... 219
14.8 Self-assessment Exercises for Chapter 14 ..... 220
References ..... 220
15 Comparing Classifiers ..... 221
15.1 Introduction ..... 221
15.2 The Paired t-Test ..... 223
15.3 Choosing Datasets for Comparative Evaluation ..... 229
15.3.1 Confidence Intervals ..... 231
15.4 Sampling ..... 231
15.5 How Bad Is a ‘No Significant Difference’ Result? ..... 234
15.6 Chapter Summary ..... 235
15.7 Self-assessment Exercises for Chapter 15 ..... 235
References ..... 236
16 Association Rule Mining I ..... 237
16.1 Introduction ..... 237
16.2 Measures of Rule Interestingness ..... 239
16.2.1 The Piatetsky-Shapiro Criteria and the RI Measure ..... 241
16.2.2 Rule Interestingness Measures Applied to the chess Dataset ..... 243
16.2.3 Using Rule Interestingness Measures for Conflict Resolution ..... 245
16.3 Association Rule Mining Tasks ..... 245
16.4 Finding the Best N Rules ..... 246
16.4.1 The J-Measure: Measuring the Information Content of a Rule ..... 247
16.4.2 Search Strategy ..... 248
16.5 Chapter Summary ..... 251
16.6 Self-assessment Exercises for Chapter 16 ..... 251
References ..... 251
17 Association Rule Mining II ..... 253
17.1 Introduction ..... 253
17.2 Transactions and Itemsets ..... 254
17.3 Support for an Itemset ..... 255
17.4 Association Rules ..... 256
17.5 Generating Association Rules ..... 258
17.6 Apriori ..... 259
17.7 Generating Supported Itemsets: An Example ..... 262
17.8 Generating Rules for a Supported Itemset ..... 264
17.9 Rule Interestingness Measures: Lift and Leverage ..... 266
17.10 Chapter Summary ..... 268
17.11 Self-assessment Exercises for Chapter 17 ..... 269
Reference ..... 269
18 Association Rule Mining III: Frequent Pattern Trees ..... 271
18.1 Introduction: FP-Growth ..... 271
18.2 Constructing the FP-tree ..... 274
18.2.1 Pre-processing the Transaction Database ..... 274
18.2.2 Initialisation ..... 276
18.2.3 Processing Transaction 1: f, c, a, m, p ..... 277
18.2.4 Processing Transaction 2: f, c, a, b, m ..... 279
18.2.5 Processing Transaction 3: f, b ..... 283
18.2.6 Processing Transaction 4: c, b, p ..... 285
18.2.7 Processing Transaction 5: f, c, a, m, p ..... 287
18.3 Finding the Frequent Itemsets from the FP-tree ..... 288
18.3.1 Itemsets Ending with Item p ..... 291
18.3.2 Itemsets Ending with Item m ..... 301
18.4 Chapter Summary ..... 308
18.5 Self-assessment Exercises for Chapter 18 ..... 309
Reference ..... 309
19 Clustering ..... 311
19.1 Introduction ..... 311
19.2 k-Means Clustering ..... 314
19.2.1 Example ..... 315
19.2.2 Finding the Best Set of Clusters ..... 319
19.3 Agglomerative Hierarchical Clustering ..... 320
19.3.1 Recording the Distance Between Clusters ..... 323
19.3.2 Terminating the Clustering Process ..... 326
19.4 Chapter Summary ..... 327
19.5 Self-assessment Exercises for Chapter 19 ..... 327
20 Text Mining ..... 329
20.1 Multiple Classifications ..... 329
20.2 Representing Text Documents for Data Mining ..... 330
20.3 Stop Words and Stemming ..... 332
20.4 Using Information Gain for Feature Reduction ..... 333
20.5 Representing Text Documents: Constructing a Vector Space Model ..... 333
20.6 Normalising the Weights ..... 335
20.7 Measuring the Distance Between Two Vectors ..... 336
20.8 Measuring the Performance of a Text Classifier ..... 337
20.9 Hypertext Categorisation ..... 338
20.9.1 Classifying Web Pages ..... 338
20.9.2 Hypertext Classification versus Text Classification ..... 339
20.10 Chapter Summary ..... 343
20.11 Self-assessment Exercises for Chapter 20 ..... 343
21 Classifying Streaming Data ..... 345
21.1 Introduction ..... 345
21.1.1 Stationary v Time-dependent Data ..... 347
21.2 Building an H-Tree: Updating Arrays ..... 347
21.2.1 Array currentAtts ..... 348
21.2.2 Array splitAtt ..... 349
21.2.3 Sorting a record to the appropriate leaf node ..... 349
21.2.4 Array hitcount ..... 350
21.2.5 Array classtotals ..... 350
21.2.6 Array acvCounts ..... 350
21.2.7 Array branch ..... 352
21.3 Building an H-Tree: a Detailed Example ..... 352
21.3.1 Step (a): Initialise Root Node 0 ..... 352
21.3.2 Step (b): Begin Reading Records ..... 353
21.3.3 Step (c): Consider Splitting at Node 0 ..... 354
21.3.4 Step (d): Split on Root Node and Initialise New Leaf Nodes ..... 355
21.3.5 Step (e): Process the Next Set of Records ..... 357
21.3.6 Step (f): Consider Splitting at Node 2 ..... 358
21.3.7 Step (g): Process the Next Set of Records ..... 359
21.3.8 Outline of the H-Tree Algorithm ..... 360
21.4 Splitting on an Attribute: Using Information Gain ..... 363
21.5 Splitting on an Attribute: Using a Hoeffding Bound ..... 365
21.6 H-Tree Algorithm: Final Version ..... 370
21.7 Using an Evolving H-Tree to Make Predictions ..... 372
21.7.1 Evaluating the Performance of an H-Tree ..... 373
21.8 Experiments: H-Tree versus TDIDT ..... 374
21.8.1 The lens24 Dataset ..... 374
21.8.2 The vote Dataset ..... 376
21.9 Chapter Summary ..... 377
21.10 Self-assessment Exercises for Chapter 21 ..... 377
References ..... 378
22 Classifying Streaming Data II: Time-Dependent Data ..... 379
22.1 Stationary versus Time-dependent Data ..... 379
22.2 Summary of the H-Tree Algorithm ..... 381
22.2.1 Array currentAtts ..... 382
22.2.2 Array splitAtt ..... 383
22.2.3 Array hitcount ..... 383
22.2.4 Array classtotals ..... 383
22.2.5 Array acvCounts ..... 384
22.2.6 Array branch ..... 384
22.2.7 Pseudocode for the H-Tree Algorithm ..... 384
22.3 From H-Tree to CDH-Tree: Overview ..... 387
22.4 From H-Tree to CDH-Tree: Incrementing Counts ..... 387
22.5 The Sliding Window Method ..... 388
22.6 Resplitting at Nodes ..... 393
22.7 Identifying Suspect Nodes ..... 394
22.8 Creating Alternate Nodes ..... 396
22.9 Growing/Forgetting an Alternate Node and its Descendants ..... 400
22.10 Replacing an Internal Node by One of its Alternate Nodes
22.11 Experiment: Tracking Concept Drift
22.11.1 lens24 Data: Alternative Mode
22.11.2 Introducing Concept Drift
22.11.3 An Experiment with Alternating lens24 Data
22.11.4 Comments on Experiment
22.12 Chapter Summary
22.13 Self-assessment Exercises for Chapter 22
Reference
23 An Introduction to Neural Networks
23.1 Introduction
23.2 Neural Nets Example 1
23.3 Neural Nets Example 2
23.3.1 Forward Propagating the Values of the Input Nodes
23.3.2 Forward Propagation: Summary of Formulae
23.4 Backpropagation
23.4.1 Stochastic Gradient Descent
23.4.2 Finding the Gradients
23.4.3 Working backwards from the output layer to the hidden layer
23.4.4 Working backwards from the hidden layer to the input layer
23.4.5 Updating the Weights
23.5 Processing a Multi-instance Training Set
23.6 Using a Neural Net for Classification: the iris Dataset
23.7 Using a Neural Net for Classification: the seeds Dataset
23.8 Neural Nets: A Note of Caution
23.9 Chapter Summary
23.10 Self-assessment Exercises for Chapter 23
A Essential Mathematics
A.1 Subscript Notation
A.1.1 Sigma Notation for Summation
A.1.2 Double Subscript Notation
A.1.3 Other Uses of Subscripts
A.2 Trees
A.2.1 Terminology
A.2.2 Interpretation
A.2.3 Subtrees
A.3 The Logarithm Function log₂ X
A.3.1 The Function −X log₂ X
A.4 Introduction to Set Theory ..... 477
A.4.1 Subsets ..... 479
A.4.2 Summary of Set Notation ..... 481
B Datasets ..... 483
References ..... 504
C Sources of Further Information ..... 505
Websites ..... 505
Books ..... 505
Conferences ..... 506
Information About Association Rule Mining ..... 507
D Glossary and Notation ..... 509
E Solutions to Self-assessment Exercises ..... 535
any_adam_object | 1 |
any_adam_object_boolean | 1 |
author | Bramer, Max A. 1948- |
author_GND | (DE-588)121430855 |
author_facet | Bramer, Max A. 1948- |
author_role | aut |
author_sort | Bramer, Max A. 1948- |
author_variant | m a b ma mab |
building | Verbundindex |
bvnumber | BV046913134 |
classification_rvk | ST 530 QH 500 |
ctrlnum | (OCoLC)1184757055 (DE-599)KXP172606784X |
dewey-full | 025.04 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 025 - Operations of libraries and archives |
dewey-raw | 025.04 |
dewey-search | 025.04 |
dewey-sort | 225.04 |
dewey-tens | 020 - Library and information sciences |
discipline | Allgemeines Informatik Wirtschaftswissenschaften |
discipline_str_mv | Allgemeines Informatik Wirtschaftswissenschaften |
edition | Fourth edition |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>00000nam a2200000 c 4500</leader><controlfield tag="001">BV046913134</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20220121</controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">200925s2020 xxka||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781447174929</subfield><subfield code="c">(pbk)</subfield><subfield code="9">978-1-4471-7492-9</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1184757055</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)KXP172606784X</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="044" ind1=" " ind2=" "><subfield code="a">xxk</subfield><subfield code="c">XA-GB</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-83</subfield><subfield code="a">DE-11</subfield><subfield code="a">DE-355</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">025.04</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 530</subfield><subfield code="0">(DE-625)143679:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">QH 500</subfield><subfield code="0">(DE-625)141607:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">54.72</subfield><subfield code="2">bkl</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">54.62</subfield><subfield code="2">bkl</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">54.64</subfield><subfield code="2">bkl</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Bramer, Max A.</subfield><subfield code="d">1948-</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)121430855</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Principles of data mining</subfield><subfield code="c">Max Bramer</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">Fourth edition</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">London</subfield><subfield code="b">Springer</subfield><subfield code="c">[2020]</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">xvi, 571 Seiten</subfield><subfield code="b">Illustrationen, Diagramme</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">Undergraduate topics in computer science</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Literaturangaben</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Data Mining</subfield><subfield code="0">(DE-588)4428654-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Data mining</subfield></datafield><datafield tag="655" ind1=" " ind2="7"><subfield code="0">(DE-588)4123623-3</subfield><subfield code="a">Lehrbuch</subfield><subfield code="2">gnd-content</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Data Mining</subfield><subfield code="0">(DE-588)4428654-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Online-Ausgabe</subfield><subfield code="z">978-1-4471-7493-6</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">HEBIS Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032322538&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="943" ind1="1" ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-032322538</subfield></datafield></record></collection> |
genre | (DE-588)4123623-3 Lehrbuch gnd-content |
genre_facet | Lehrbuch |
id | DE-604.BV046913134 |
illustrated | Illustrated |
index_date | 2024-07-03T15:28:48Z |
indexdate | 2024-07-20T09:03:54Z |
institution | BVB |
isbn | 9781447174929 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-032322538 |
oclc_num | 1184757055 |
open_access_boolean | |
owner | DE-83 DE-11 DE-355 DE-BY-UBR |
owner_facet | DE-83 DE-11 DE-355 DE-BY-UBR |
physical | xvi, 571 Seiten Illustrationen, Diagramme |
publishDate | 2020 |
publishDateSearch | 2020 |
publishDateSort | 2020 |
publisher | Springer |
record_format | marc |
series2 | Undergraduate topics in computer science |
spelling | Bramer, Max A. 1948- Verfasser (DE-588)121430855 aut Principles of data mining Max Bramer Fourth edition London Springer [2020] xvi, 571 Seiten Illustrationen, Diagramme txt rdacontent n rdamedia nc rdacarrier Undergraduate topics in computer science Literaturangaben Data Mining (DE-588)4428654-5 gnd rswk-swf Data mining (DE-588)4123623-3 Lehrbuch gnd-content Data Mining (DE-588)4428654-5 s DE-604 Erscheint auch als Online-Ausgabe 978-1-4471-7493-6 HEBIS Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032322538&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Bramer, Max A. 1948- Principles of data mining Data Mining (DE-588)4428654-5 gnd |
subject_GND | (DE-588)4428654-5 (DE-588)4123623-3 |
title | Principles of data mining |
title_auth | Principles of data mining |
title_exact_search | Principles of data mining |
title_exact_search_txtP | Principles of data mining |
title_full | Principles of data mining Max Bramer |
title_fullStr | Principles of data mining Max Bramer |
title_full_unstemmed | Principles of data mining Max Bramer |
title_short | Principles of data mining |
title_sort | principles of data mining |
topic | Data Mining (DE-588)4428654-5 gnd |
topic_facet | Data Mining Lehrbuch |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032322538&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT bramermaxa principlesofdatamining |