Practical data science with R:
Main authors: | Zumel, Nina; Mount, John |
---|---|
Format: | Book |
Language: | English |
Published: | Shelter Island Manning 2019 |
Edition: | 2. ed. |
Subjects: | |
Online access: | Table of contents |
Description: | Includes later unchanged reprints |
Description: | XXVII, 536 pages, illustrations, diagrams, 24 cm |
ISBN: | 9781617295874 1617295876 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV045898315 | ||
003 | DE-604 | ||
005 | 20200708 | ||
007 | t | ||
008 | 190524s2019 xxua||| |||| 00||| eng d | ||
020 | |a 9781617295874 |c pbk. : No price |9 978-1-61729-587-4 | ||
020 | |a 1617295876 |9 1-61729-587-6 | ||
024 | 3 | |a 978-1-61729-587-4 | |
035 | |a (OCoLC)1136233655 | ||
035 | |a (DE-599)BVBBV045898315 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
044 | |a xxu |c XD-US | ||
049 | |a DE-739 |a DE-898 |a DE-573 |a DE-1043 | ||
082 | 0 | |a 006.312 | |
084 | |a ST 601 |0 (DE-625)143682: |2 rvk | ||
084 | |a ST 250 |0 (DE-625)143626: |2 rvk | ||
100 | 1 | |a Zumel, Nina |e Verfasser |0 (DE-588)1055925899 |4 aut | |
245 | 1 | 0 | |a Practical data science with R |c Nina Zumel; John Mount |
250 | |a 2. ed. | ||
264 | 1 | |a Shelter Island |b Manning |c 2019 | |
300 | |a XXVII, 536 Seiten |b Illustrationen, Diagramme |c 24 cm | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
500 | |a Hier auch später erschienene, unveränderte Nachdrucke | ||
650 | 0 | 7 | |a R |g Programm |0 (DE-588)4705956-4 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Datenanalyse |0 (DE-588)4123037-1 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Statistik |0 (DE-588)4056995-0 |2 gnd |9 rswk-swf |
653 | 0 | |a Data mining | |
653 | 0 | |a R (Computer program language) | |
689 | 0 | 0 | |a R |g Programm |0 (DE-588)4705956-4 |D s |
689 | 0 | 1 | |a Datenanalyse |0 (DE-588)4123037-1 |D s |
689 | 0 | 2 | |a Statistik |0 (DE-588)4056995-0 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Mount, John |e Verfasser |0 (DE-588)1202632769 |4 aut | |
856 | 4 | 2 | |m Digitalisierung UB Passau - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=031281191&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-031281191 |
Record in the search index
_version_ | 1804180060008087552 |
---|---|
adam_text | contents foreword xv preface xvi acknowledgments xvii about this book xviii about the authors xxv about the foreword authors about the cover illustration xxvi xxvii The data science process 1.1 The roles in a data science project Project roles 1.2 3 4 4 Stages of a data science project 6 Defining the goal 7 ■ Data collection and management 8 Modeling 10 ■ Model evaluation and critique 12 Presentation and documentation 14 ■ Model deployment and maintenance 15 1.3 Setting expectations 16 Determining lower bounds on model performance Starting with R and data 2.1 Starting with R 16 18 19 Installing R, tools, and examples 20 ■ R programming 20
2.2 Working with data from files 29 Working with well-structured data from files or URLs Using R with less-structured data 34 2.3 Working with relational databases A production-size example Exploring data 3.1 29 Choosing and evaluating models 163 37 6.1 38 51 Using summary statistics to spot problems 53 6.2 54 Spotting problems using graphics and visualization 4.1 6.3 88 Cleaning data Data transformations 4.3 Sampling for modeling and validation 7.1 Data engineering and data shaping 5.1 Data selection 107 111 7.2 113 Basic data transforms Adding new columns 5.3 128 128 · Other simple operations Aggregating transforms Multitable data transforms Reshaping transforms 216 Using logistic regression 237 Regularization 257 An example of quasi-separation 257 · The types of regularized regression 262 · Regularized regression with glmnet 263 134 137 Combining two or more ordered data frames quickly 137 Principal methods to combine data from multiple tables 143 5.5 7.3 134 Combining many rows into summary rows 5.4 133 Using linear regression 215 Understanding logistic regression 237 · Building a logistic regression model 242 · Making predictions 243 Finding relations and extracting advice from logistic models 248 · Reading the model summary and characterizing coefficients 249 · Logistic regression takeaways 256 116 Subsetting rows and columns 116 · Removing records with incomplete data 121 · Ordering rows 124 5.2 Local interpretable model-agnostic explanations (LIME) for explaining model predictions 195 Understanding linear regression 217 · Building a linear regression model 221 ·
Making predictions 222 Finding relations and extracting advice 228 · Reading the model summary and characterizing coefficient quality 230 Linear regression takeaways 237 104 Test and training splits 108 · Creating a sample group column 109 · Record grouping 110 · Data provenance 170 Linear and logistic regression 98 Normalization 99 · Centering and scaling 101 Log transformations for skewed and wide distributions 166 LIME: Automated sanity checking 197 · Walking through LIME: A small example 197 · LIME for text classification 204 Training the text classifier 208 · Explaining the classifier’s predictions 209 88 Domain-specific data cleaning 89 · Treating missing values 91 · The vtreat package for automatically treating missing variables 95 4.2 Evaluating models 164 Overfitting 170 · Measures of model performance 174 Evaluating classification models 175 · Evaluating scoring models 185 · Evaluating probability models 187 58 Visually checking distributions for a single variable 60 Visually checking relationships between two variables 70 Managing data Mapping problems to machine learning tasks Classification problems 165 · Scoring problems Grouping: working without known targets 167 Problem-to-method mapping 169 Typical problems revealed by data summaries 3.2 149 Moving data from wide to tall form 149 · Moving data from tall to wide form 153 · Data coordinates 158 Advanced data preparation 274 8.1 The purpose of the vtreat package 8.2 KDD and KDD Cup 2009 275 277 Getting started with KDD Cup 2009 data 278 · The bull-in-the-china-shop approach 280
8.3 Basic data preparation for classification The variable score frame plan 288 8.4 282 284 · Properly using the treatment Documentation and deployment Advanced data preparation for classification Using mkCrossFrameCExperiment() model 292 8.5 8.6 290 11.1 11.2 290 · Building a Preparing data for regression modeling Mastering the vtreat package 299 9.1 Cluster analysis 11.3 Association rules 311 11.4 312 10.1 Tree-based methods 342 353 355 A basic decision tree 356 · Using bagging to improve prediction 359 · Using random forests to further improve prediction 361 · Gradient-boosted trees 368 · Tree-based model takeaways 376 10.2 Deploying models 428 Solving “inseparable” problems using support vector machines 389 Using an SVM to solve a problem 390 · Understanding support vector machines 395 · Understanding kernel functions 397 Support vector machine and kernel methods takeaways 399 437 Presenting your results to the project sponsor 439 Summarizing the project’s goals 440 · Stating the project’s results 442 · Filling in the details 444 · Making recommendations and discussing future work 446 Project sponsor presentation takeaways 446 12.2 Presenting your model to end users 447 Summarizing the project goals 447 · Showing how the model fits user workflow 448 · Showing how to use the model 450 · End user presentation takeaways 452 12.3 Presenting your work to other data scientists 452 Introducing the problem 452 · Discussing related work 453 Discussing your approach 454 · Discussing results and future work 455 · Peer presentation takeaways 457 Using generalized additive
models (GAMs) to learn non-monotone relationships 376 Understanding GAMs 376 · A one-dimensional regression example 378 · Extracting the non-linear relationships 382 Using GAM on actual data 384 · Using GAM for logistic regression 387 · GAM takeaways 388 10.3 Using comments and version control for running documentation 414 Producing effective presentations 12.1 Exploring advanced methods 411 Deploying demonstrations using Shiny 430 · Deploying models as HTTP services 431 · Deploying models by export 433 · What to take away 435 340 Overview of association rules 340 · The example problem Mining association rules with the arules package 343 Association rule takeaways 351 Predicting buzz 405 Using R markdown to produce milestone documentation Writing effective comments 414 · Using version control to record history 416 · Using version control to explore your project 422 · Using version control to share work 424 Distances 313 · Preparing the data 316 · Hierarchical clustering with hclust 319 · The k-means algorithm 332 Assigning new points to clusters 338 · Clustering takeaways 340 9.2 403 What is R markdown? 407 · knitr technical details 409 Using knitr to document the Buzz data and produce the model 297 The vtreat phases 299 · Missing values 301 Indicator variables 303 · Impact coding 304 The treatment plan 305 · The cross-frame 306 Unsupervised methods appendix A appendix B appendix C Starting with R and other tools 459 Important statistical concepts 484 Bibliography 519 index 523 406
|
any_adam_object | 1 |
author | Zumel, Nina Mount, John |
author_GND | (DE-588)1055925899 (DE-588)1202632769 |
author_facet | Zumel, Nina Mount, John |
author_role | aut aut |
author_sort | Zumel, Nina |
author_variant | n z nz j m jm |
building | Verbundindex |
bvnumber | BV045898315 |
classification_rvk | ST 601 ST 250 |
ctrlnum | (OCoLC)1136233655 (DE-599)BVBBV045898315 |
dewey-full | 006.312 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 006 - Special computer methods |
dewey-raw | 006.312 |
dewey-search | 006.312 |
dewey-sort | 16.312 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik |
edition | 2. ed. |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01884nam a2200481 c 4500</leader><controlfield tag="001">BV045898315</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20200708 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">190524s2019 xxua||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781617295874</subfield><subfield code="c">pbk. : No price</subfield><subfield code="9">978-1-61729-587-4</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">1617295876</subfield><subfield code="9">1-61729-587-6</subfield></datafield><datafield tag="024" ind1="3" ind2=" "><subfield code="a">978-1-61729-587-4</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1136233655</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV045898315</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="044" ind1=" " ind2=" "><subfield code="a">xxu</subfield><subfield code="c">XD-US</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-739</subfield><subfield code="a">DE-898</subfield><subfield code="a">DE-573</subfield><subfield code="a">DE-1043</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">006.312</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 601</subfield><subfield code="0">(DE-625)143682:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 250</subfield><subfield code="0">(DE-625)143626:</subfield><subfield 
code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Zumel, Nina</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1055925899</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Practical data science with R</subfield><subfield code="c">Nina Zumel; John Mount</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">2. ed.</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Shelter Island</subfield><subfield code="b">Manning</subfield><subfield code="c">2019</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XXVII, 536 Seiten</subfield><subfield code="b">Illustrationen, Diagramme</subfield><subfield code="c">24 cm</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Hier auch später erschienene, unveränderte Nachdrucke</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">R</subfield><subfield code="g">Programm</subfield><subfield code="0">(DE-588)4705956-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Datenanalyse</subfield><subfield code="0">(DE-588)4123037-1</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Statistik</subfield><subfield code="0">(DE-588)4056995-0</subfield><subfield code="2">gnd</subfield><subfield 
code="9">rswk-swf</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Data mining</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">R (Computer program language)</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">R</subfield><subfield code="g">Programm</subfield><subfield code="0">(DE-588)4705956-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Datenanalyse</subfield><subfield code="0">(DE-588)4123037-1</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Statistik</subfield><subfield code="0">(DE-588)4056995-0</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Mount, John</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1202632769</subfield><subfield code="4">aut</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Passau - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=031281191&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-031281191</subfield></datafield></record></collection> |
id | DE-604.BV045898315 |
illustrated | Illustrated |
indexdate | 2024-07-10T08:29:45Z |
institution | BVB |
isbn | 9781617295874 1617295876 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-031281191 |
oclc_num | 1136233655 |
open_access_boolean | |
owner | DE-739 DE-898 DE-BY-UBR DE-573 DE-1043 |
owner_facet | DE-739 DE-898 DE-BY-UBR DE-573 DE-1043 |
physical | XXVII, 536 Seiten Illustrationen, Diagramme 24 cm |
publishDate | 2019 |
publishDateSearch | 2019 |
publishDateSort | 2019 |
publisher | Manning |
record_format | marc |
spelling | Zumel, Nina Verfasser (DE-588)1055925899 aut Practical data science with R Nina Zumel; John Mount 2. ed. Shelter Island Manning 2019 XXVII, 536 Seiten Illustrationen, Diagramme 24 cm txt rdacontent n rdamedia nc rdacarrier Hier auch später erschienene, unveränderte Nachdrucke R Programm (DE-588)4705956-4 gnd rswk-swf Datenanalyse (DE-588)4123037-1 gnd rswk-swf Statistik (DE-588)4056995-0 gnd rswk-swf Data mining R (Computer program language) R Programm (DE-588)4705956-4 s Datenanalyse (DE-588)4123037-1 s Statistik (DE-588)4056995-0 s DE-604 Mount, John Verfasser (DE-588)1202632769 aut Digitalisierung UB Passau - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=031281191&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Zumel, Nina Mount, John Practical data science with R R Programm (DE-588)4705956-4 gnd Datenanalyse (DE-588)4123037-1 gnd Statistik (DE-588)4056995-0 gnd |
subject_GND | (DE-588)4705956-4 (DE-588)4123037-1 (DE-588)4056995-0 |
title | Practical data science with R |
title_auth | Practical data science with R |
title_exact_search | Practical data science with R |
title_full | Practical data science with R Nina Zumel; John Mount |
title_fullStr | Practical data science with R Nina Zumel; John Mount |
title_full_unstemmed | Practical data science with R Nina Zumel; John Mount |
title_short | Practical data science with R |
title_sort | practical data science with r |
topic | R Programm (DE-588)4705956-4 gnd Datenanalyse (DE-588)4123037-1 gnd Statistik (DE-588)4056995-0 gnd |
topic_facet | R Programm Datenanalyse Statistik |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=031281191&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT zumelnina practicaldatasciencewithr AT mountjohn practicaldatasciencewithr |