Exploring data with RapidMiner: explore, understand, and prepare real data using RapidMiner's practical tips and tricks
Main author: | Chisholm, Andrew |
---|---|
Format: | Book |
Language: | English |
Published: | Birmingham [and others]: Packt Publ., 2013 |
Edition: | 1. publ. |
Series: | Community experience distilled |
Subjects: | Data Mining |
Online access: | Table of contents |
Description: | IV, 148 pages: illustrations, diagrams |
ISBN: | 9781782169338 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV041803622 | ||
003 | DE-604 | ||
005 | 20140505 | ||
007 | t | ||
008 | 140415s2013 ad|| |||| 00||| eng d | ||
016 | 7 | |a 775657948 |2 DE-101 | |
020 | |a 9781782169338 |9 978-1-78216-933-8 | ||
024 | 3 | |a 9781782169338 | |
035 | |a (OCoLC)868306800 | ||
035 | |a (DE-599)HBZHT018115700 | ||
040 | |a DE-604 |b ger | ||
041 | 0 | |a eng | |
049 | |a DE-N32 |a DE-473 |a DE-B768 | ||
084 | |a ST 530 |0 (DE-625)143679: |2 rvk | ||
100 | 1 | |a Chisholm, Andrew |d 1959- |e Verfasser |0 (DE-588)171063112 |4 aut | |
245 | 1 | 0 | |a Exploring data with RapidMiner |b explore, understand, and prepare real data using RapidMiner's practical tips and tricks |c Andrew Chisholm |
246 | 1 | 0 | |a rapid miner |
250 | |a 1. publ. | ||
264 | 1 | |a Birmingham [u.a.] |b Packt Publ. |c 2013 | |
300 | |a IV, 148 S. |b Ill., graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 0 | |a Community experience distilled | |
650 | 0 | 7 | |a Data Mining |0 (DE-588)4428654-5 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Data Mining |0 (DE-588)4428654-5 |D s |
689 | 0 | |5 DE-604 | |
856 | 4 | 2 | |m Digitalisierung UB Bamberg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027249118&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-027249118 |
Record in the search index
_version_ | 1804152125283893248 |
---|---|
adam_text | Table of Contents
Preface
Chapter 1: Setting the Scene
A process framework 8
Data volume and velocity 10
Data variety, formats, and meanings 11
Missing data 12
Cleaning data 12
Visualizing data 13
Resource constraints 13
Terminology 14
Accompanying material 15
Summary 16
Chapter 2: Loading Data 17
Reading files 17
Alternative delimiters 20
Reading complete lines 21
Reading large numbers of attributes 21
Splitting files into smaller pieces 23
Databases 25
The Read Database operator 25
Large datasets 27
Using macros 27
Summary 28
Chapter 3: Visualizing Data
Getting started
Statistical summaries
Relationships between attributes 32
Scatter plots
Scatter 3D color
Parallel and deviation 35
Quartile color 38
Time series data 39
Plotting series 39
Using the survey plotter 42
Relations between examples 43
Using histograms
Using block plots 45
Summary 47
Chapter 4: Parsing and Converting Attributes 49
Generating attributes 50
Date functions 51
Regular expression functions 53
Generating extracts 54
Regular expressions 54
XPath 57
Renaming attributes 59
Searching and replacing attribute values 59
Using the Map operator 59
Using the Replace operator 60
Using the Replace (Dictionary) operator 60
Summary 62
Chapter 5: Outliers 63
Manual inspection 63
Increasing the data volume 68
Rules for handling outliers 68
Automated detection of example outliers 69
The Detect Outlier (Distances) operator 69
The Detect Outlier (Densities) operator 73
The Detect Outlier (LOF) operator 74
The Detect Outliers (COF) operator 75
Summary 76
Chapter 6: Missing Values 77
Missing or empty? 77
Types of missing data 78
Missing completely at random 78
Missing at random 78
Not missing at random 79
Categorizing missing data 79
Finding MCAR data 83
Finding MAR data 85
Finding NMAR data 86
A cautionary note 87
Effect of missing data 88
Options for handling missing data 88
Returning to the root cause 89
Ignoring it 89
Manual editing 89
Deletion of examples 90
Deletion of attributes 90
Imputation with single values 90
Modeling 91
Summary 91
Chapter 7: Transforming Data 93
Creating new attributes 94
Aggregation 98
Using pivoting 100
Using de-pivoting 102
Summary 106
Chapter 8: Reducing Data Size 107
Removing examples using sampling 107
Removing attributes 108
Removing useless attributes 109
Weighting attributes 111
Selecting attributes using models 114
Summary 119
Chapter 9: Resource Constraints 121
Measuring and estimating performance 121
Measuring performance 122
Adding memory 129
Parallel processing 130
Restructuring processes 131
Summary 131
Chapter 10: Debugging 133
Breakpoints in RapidMiner Studio 133
Logging data in RapidMiner Studio 134
RapidMiner Studio console printing 135
Groovy scripts 136
Outputting macros example 137
Console logging with Groovy 137
Regex tools 138
Using XPath effectively 138
Summary 139
Chapter 11: Taking Stock 141
Exploring new techniques 142
Time series 142
Web mining 142
Using R 142
Java or Groovy 142
Third-party components 143
RapidMiner Server 143
Where to go next 143
Index 145
|
any_adam_object | 1 |
author | Chisholm, Andrew 1959- |
author_GND | (DE-588)171063112 |
author_facet | Chisholm, Andrew 1959- |
author_role | aut |
author_sort | Chisholm, Andrew 1959- |
author_variant | a c ac |
building | Verbundindex |
bvnumber | BV041803622 |
classification_rvk | ST 530 |
ctrlnum | (OCoLC)868306800 (DE-599)HBZHT018115700 |
discipline | Informatik |
edition | 1. publ. |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01472nam a2200373 c 4500</leader><controlfield tag="001">BV041803622</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20140505 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">140415s2013 ad|| |||| 00||| eng d</controlfield><datafield tag="016" ind1="7" ind2=" "><subfield code="a">775657948</subfield><subfield code="2">DE-101</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781782169338</subfield><subfield code="9">978-1-78216-933-8</subfield></datafield><datafield tag="024" ind1="3" ind2=" "><subfield code="a">9781782169338</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)868306800</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)HBZHT018115700</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-N32</subfield><subfield code="a">DE-473</subfield><subfield code="a">DE-B768</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 530</subfield><subfield code="0">(DE-625)143679:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Chisholm, Andrew</subfield><subfield code="d">1959-</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)171063112</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Exploring data with RapidMiner</subfield><subfield code="b">explore, understand, and prepare real data using RapidMiner's practical tips and tricks</subfield><subfield code="c">Andrew 
Chisholm</subfield></datafield><datafield tag="246" ind1="1" ind2="0"><subfield code="a">rapid miner</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">1. publ.</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Birmingham [u.a.]</subfield><subfield code="b">Packt Publ.</subfield><subfield code="c">2013</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">IV, 148 S.</subfield><subfield code="b">Ill., graph. Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">Community experience distilled</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Data Mining</subfield><subfield code="0">(DE-588)4428654-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Data Mining</subfield><subfield code="0">(DE-588)4428654-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Bamberg - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&amp;doc_library=BVB01&amp;local_base=BVB01&amp;doc_number=027249118&amp;sequence=000002&amp;line_number=0001&amp;func_code=DB_RECORDS&amp;service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield 
code="a">oai:aleph.bib-bvb.de:BVB01-027249118</subfield></datafield></record></collection> |
id | DE-604.BV041803622 |
illustrated | Illustrated |
indexdate | 2024-07-10T01:05:44Z |
institution | BVB |
isbn | 9781782169338 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-027249118 |
oclc_num | 868306800 |
open_access_boolean | |
owner | DE-N32 DE-473 DE-BY-UBG DE-B768 |
owner_facet | DE-N32 DE-473 DE-BY-UBG DE-B768 |
physical | IV, 148 S. Ill., graph. Darst. |
publishDate | 2013 |
publishDateSearch | 2013 |
publishDateSort | 2013 |
publisher | Packt Publ. |
record_format | marc |
series2 | Community experience distilled |
spelling | Chisholm, Andrew 1959- Verfasser (DE-588)171063112 aut Exploring data with RapidMiner explore, understand, and prepare real data using RapidMiner's practical tips and tricks Andrew Chisholm rapid miner 1. publ. Birmingham [u.a.] Packt Publ. 2013 IV, 148 S. Ill., graph. Darst. txt rdacontent n rdamedia nc rdacarrier Community experience distilled Data Mining (DE-588)4428654-5 gnd rswk-swf Data Mining (DE-588)4428654-5 s DE-604 Digitalisierung UB Bamberg - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027249118&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Chisholm, Andrew 1959- Exploring data with RapidMiner explore, understand, and prepare real data using RapidMiner's practical tips and tricks Data Mining (DE-588)4428654-5 gnd |
subject_GND | (DE-588)4428654-5 |
title | Exploring data with RapidMiner explore, understand, and prepare real data using RapidMiner's practical tips and tricks |
title_alt | rapid miner |
title_auth | Exploring data with RapidMiner explore, understand, and prepare real data using RapidMiner's practical tips and tricks |
title_exact_search | Exploring data with RapidMiner explore, understand, and prepare real data using RapidMiner's practical tips and tricks |
title_full | Exploring data with RapidMiner explore, understand, and prepare real data using RapidMiner's practical tips and tricks Andrew Chisholm |
title_fullStr | Exploring data with RapidMiner explore, understand, and prepare real data using RapidMiner's practical tips and tricks Andrew Chisholm |
title_full_unstemmed | Exploring data with RapidMiner explore, understand, and prepare real data using RapidMiner's practical tips and tricks Andrew Chisholm |
title_short | Exploring data with RapidMiner |
title_sort | exploring data with rapidminer explore understand and prepare real data using rapidminer s practical tips and tricks |
title_sub | explore, understand, and prepare real data using RapidMiner's practical tips and tricks |
topic | Data Mining (DE-588)4428654-5 gnd |
topic_facet | Data Mining |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027249118&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT chisholmandrew exploringdatawithrapidminerexploreunderstandandpreparerealdatausingrapidminerspracticaltipsandtricks AT chisholmandrew rapidminer |