R and data mining: examples and case studies
Saved in:
Main author: | Zhao, Yanchang |
---|---|
Format: | Book |
Language: | English |
Published: |
Amsterdam [u.a.]
Elsevier
2013
|
Edition: | 1. ed. |
Subjects: | |
Online access: | Table of contents |
Description: | XIII, 234 pages, graphs |
ISBN: | 9780123969637 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV040916126 | ||
003 | DE-604 | ||
005 | 20130913 | ||
007 | t | ||
008 | 130327s2013 d||| |||| 00||| eng d | ||
020 | |a 9780123969637 |9 978-0-123-96963-7 | ||
035 | |a (OCoLC)828134419 | ||
035 | |a (DE-599)BVBBV040916126 | ||
040 | |a DE-604 |b ger | ||
041 | 0 | |a eng | |
049 | |a DE-29 |a DE-824 |a DE-384 | ||
082 | 0 | |a 006.312 | |
084 | |a SK 850 |0 (DE-625)143263: |2 rvk | ||
084 | |a ST 601 |0 (DE-625)143682: |2 rvk | ||
100 | 1 | |a Zhao, Yanchang |e Verfasser |4 aut | |
245 | 1 | 0 | |a R and data mining |b examples and case studies |c Yanchang Zhao |
250 | |a 1. ed. | ||
264 | 1 | |a Amsterdam [u.a.] |b Elsevier |c 2013 | |
300 | |a XIII, 234 S. |b graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
650 | 0 | 7 | |a R |g Programm |0 (DE-588)4705956-4 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Data Mining |0 (DE-588)4428654-5 |2 gnd |9 rswk-swf |
655 | 7 | |8 1\p |0 (DE-588)4522595-3 |a Fallstudiensammlung |2 gnd-content | |
689 | 0 | 0 | |a Data Mining |0 (DE-588)4428654-5 |D s |
689 | 0 | 1 | |a R |g Programm |0 (DE-588)4705956-4 |D s |
689 | 0 | |C b |5 DE-604 | |
856 | 4 | 2 | |m HBZ Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=025895356&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-025895356 | ||
883 | 1 | |8 1\p |a cgwrk |d 20201028 |q DE-101 |u https://d-nb.info/provenance/plan#cgwrk |
Record in the search index
_version_ | 1804150209265008640 |
---|---|
adam_text | Title: R and data mining
Author: Zhao, Yanchang
Year: 2013
Contents
List of Figures xi
List of Abbreviations xv
1 Introduction 1
1.1 Data Mining 1
1.2 R 2
1.3 Datasets 2
1.3.1 The Iris Dataset 2
1.3.2 The Bodyfat Dataset 3
2 Data Import and Export 5
2.1 Save and Load R Data 5
2.2 Import from and Export to .CSV Files 5
2.3 Import Data from SAS 6
2.4 Import/Export via ODBC 8
2.4.1 Read from Databases 8
2.4.2 Output to and Input from EXCEL Files 9
3 Data Exploration 11
3.1 Have a Look at Data 11
3.2 Explore Individual Variables 13
3.3 Explore Multiple Variables 16
3.4 More Explorations 20
3.5 Save Charts into Files 25
4 Decision Trees and Random Forest 27
4.1 Decision Trees with Package party 27
4.2 Decision Trees with Package rpart 31
4.3 Random Forest 36
5 Regression 41
5.1 Linear Regression 41
5.2 Logistic Regression 47
5.3 Generalized Linear Regression 48
5.4 Non-Linear Regression 50
6 Clustering 51
6.1 The k-Means Clustering 51
6.2 The k-Medoids Clustering 53
6.3 Hierarchical Clustering 56
6.4 Density-Based Clustering 57
7 Outlier Detection 63
7.1 Univariate Outlier Detection 63
7.2 Outlier Detection with LOF 66
7.3 Outlier Detection by Clustering 70
7.4 Outlier Detection from Time Series 72
7.5 Discussions 73
8 Time Series Analysis and Mining 75
8.1 Time Series Data in R 75
8.2 Time Series Decomposition 76
8.3 Time Series Forecasting 78
8.4 Time Series Clustering 78
8.4.1 Dynamic Time Warping 79
8.4.2 Synthetic Control Chart Time Series Data 79
8.4.3 Hierarchical Clustering with Euclidean Distance 80
8.4.4 Hierarchical Clustering with DTW Distance 82
8.5 Time Series Classification 83
8.5.1 Classification with Original Data 83
8.5.2 Classification with Extracted Features 84
8.5.3 k-NN Classification 86
8.6 Discussions 87
8.7 Further Readings 87
9 Association Rules 89
9.1 Basics of Association Rules 89
9.2 The Titanic Dataset 90
9.3 Association Rule Mining 92
9.4 Removing Redundancy 96
9.5 Interpreting Rules 98
9.6 Visualizing Association Rules 99
9.7 Discussions and Further Readings 103
10 Text Mining 105
10.1 Retrieving Text from Twitter 105
10.2 Transforming Text 106
10.3 Stemming Words 108
10.4 Building a Term-Document Matrix 110
10.5 Frequent Terms and Associations 111
10.6 WordCloud 113
10.7 Clustering Words 114
10.8 Clustering Tweets 116
10.8.1 Clustering Tweets with the k-Means Algorithm 116
10.8.2 Clustering Tweets with the k-Medoids Algorithm 118
10.9 Packages, Further Readings, and Discussions 121
11 Social Network Analysis 123
11.1 Network of Terms 123
11.2 Network of Tweets 127
11.3 Two-Mode Network 132
11.4 Discussions and Further Readings 136
12 Case Study I: Analysis and Forecasting of House Price Indices 137
12.1 Importing HPI Data 137
12.2 Exploration of HPI Data 138
12.3 Trend and Seasonal Components of HPI 145
12.4 HPI Forecasting 147
12.5 The Estimated Price of a Property 149
12.6 Discussion 149
13 Case Study II: Customer Response Prediction and Profit Optimization 151
13.1 Introduction 151
13.2 The Data of KDD Cup 1998 151
13.3 Data Exploration 160
13.4 Training Decision Trees 166
13.5 Model Evaluation 170
13.6 Selecting the Best Tree 173
13.7 Scoring 176
13.8 Discussions and Conclusions 179
14 Case Study III: Predictive Modeling of Big Data with Limited Memory 181
14.1 Introduction 181
14.2 Methodology 182
14.3 Data and Variables 182
14.4 Random Forest 183
14.5 Memory Issue 185
14.6 Train Models on Sample Data 186
14.7 Build Models with Selected Variables 188
14.8 Scoring 194
14.9 Print Rules 201
14.9.1 Print Rules in Text 201
14.9.2 Print Rules for Scoring with SAS 205
14.10 Conclusions and Discussion 211
15 Online Resources 213
15.1 R Reference Cards 213
15.2 R 213
15.3 Data Mining 214
15.4 Data Mining with R 216
15.5 Classification/Prediction with R 216
15.6 Time Series Analysis with R 216
15.7 Association Rule Mining with R 216
15.8 Spatial Data Analysis with R 217
15.9 Text Mining with R 217
15.10 Social Network Analysis with R 217
15.11 Data Cleansing and Transformation with R 218
15.12 Big Data and Parallel Computing with R 218
R Reference Card for Data Mining 221
Bibliography 225
General Index 229
Package Index 231
Function Index 233
|
any_adam_object | 1 |
author | Zhao, Yanchang |
author_facet | Zhao, Yanchang |
author_role | aut |
author_sort | Zhao, Yanchang |
author_variant | y z yz |
building | Verbundindex |
bvnumber | BV040916126 |
classification_rvk | SK 850 ST 601 |
ctrlnum | (OCoLC)828134419 (DE-599)BVBBV040916126 |
dewey-full | 006.312 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 006 - Special computer methods |
dewey-raw | 006.312 |
dewey-search | 006.312 |
dewey-sort | 16.312 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Computer Science Mathematics |
edition | 1. ed. |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01523nam a2200397 c 4500</leader><controlfield tag="001">BV040916126</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20130913 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">130327s2013 d||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9780123969637</subfield><subfield code="9">978-0-123-96963-7</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)828134419</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV040916126</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-29</subfield><subfield code="a">DE-824</subfield><subfield code="a">DE-384</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">006.312</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">SK 850</subfield><subfield code="0">(DE-625)143263:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 601</subfield><subfield code="0">(DE-625)143682:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Zhao, Yanchang</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">R and data mining</subfield><subfield code="b">examples and case studies</subfield><subfield code="c">Yanchang Zhao</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">1. 
ed.</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Amsterdam [u.a.]</subfield><subfield code="b">Elsevier</subfield><subfield code="c">2013</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XIII, 234 S.</subfield><subfield code="b">graph. Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">R</subfield><subfield code="g">Programm</subfield><subfield code="0">(DE-588)4705956-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Data Mining</subfield><subfield code="0">(DE-588)4428654-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="655" ind1=" " ind2="7"><subfield code="8">1\p</subfield><subfield code="0">(DE-588)4522595-3</subfield><subfield code="a">Fallstudiensammlung</subfield><subfield code="2">gnd-content</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Data Mining</subfield><subfield code="0">(DE-588)4428654-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">R</subfield><subfield code="g">Programm</subfield><subfield code="0">(DE-588)4705956-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="C">b</subfield><subfield code="5">DE-604</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">HBZ Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield 
code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=025895356&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-025895356</subfield></datafield><datafield tag="883" ind1="1" ind2=" "><subfield code="8">1\p</subfield><subfield code="a">cgwrk</subfield><subfield code="d">20201028</subfield><subfield code="q">DE-101</subfield><subfield code="u">https://d-nb.info/provenance/plan#cgwrk</subfield></datafield></record></collection> |
genre | 1\p (DE-588)4522595-3 Fallstudiensammlung gnd-content |
genre_facet | Fallstudiensammlung |
id | DE-604.BV040916126 |
illustrated | Illustrated |
indexdate | 2024-07-10T00:35:17Z |
institution | BVB |
isbn | 9780123969637 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-025895356 |
oclc_num | 828134419 |
open_access_boolean | |
owner | DE-29 DE-824 DE-384 |
owner_facet | DE-29 DE-824 DE-384 |
physical | XIII, 234 S. graph. Darst. |
publishDate | 2013 |
publishDateSearch | 2013 |
publishDateSort | 2013 |
publisher | Elsevier |
record_format | marc |
spelling | Zhao, Yanchang Verfasser aut R and data mining examples and case studies Yanchang Zhao 1. ed. Amsterdam [u.a.] Elsevier 2013 XIII, 234 S. graph. Darst. txt rdacontent n rdamedia nc rdacarrier R Programm (DE-588)4705956-4 gnd rswk-swf Data Mining (DE-588)4428654-5 gnd rswk-swf 1\p (DE-588)4522595-3 Fallstudiensammlung gnd-content Data Mining (DE-588)4428654-5 s R Programm (DE-588)4705956-4 s b DE-604 HBZ Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=025895356&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis 1\p cgwrk 20201028 DE-101 https://d-nb.info/provenance/plan#cgwrk |
spellingShingle | Zhao, Yanchang R and data mining examples and case studies R Programm (DE-588)4705956-4 gnd Data Mining (DE-588)4428654-5 gnd |
subject_GND | (DE-588)4705956-4 (DE-588)4428654-5 (DE-588)4522595-3 |
title | R and data mining examples and case studies |
title_auth | R and data mining examples and case studies |
title_exact_search | R and data mining examples and case studies |
title_full | R and data mining examples and case studies Yanchang Zhao |
title_fullStr | R and data mining examples and case studies Yanchang Zhao |
title_full_unstemmed | R and data mining examples and case studies Yanchang Zhao |
title_short | R and data mining |
title_sort | r and data mining examples and case studies |
title_sub | examples and case studies |
topic | R Programm (DE-588)4705956-4 gnd Data Mining (DE-588)4428654-5 gnd |
topic_facet | R Programm Data Mining Fallstudiensammlung |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=025895356&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT zhaoyanchang randdataminingexamplesandcasestudies |