R in action: data analysis and graphics with R and tidyverse
Main author: | Kabacoff, Robert |
---|---|
Format: | Book |
Language: | English |
Published: | Shelter Island: Manning, [2022] |
Edition: | Third edition |
Subjects: | Statistik; Datenanalyse; R (Programm) |
Online access: | Table of contents |
Description: | xxxi, 622 pages, illustrations, diagrams, 24 cm |
ISBN: | 9781617296055 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV047471764 | ||
003 | DE-604 | ||
005 | 20230310 | ||
007 | t | ||
008 | 210916s2022 xxua||| |||| 00||| eng d | ||
020 | |a 9781617296055 |q paperback |9 978-1-61729-605-5 | ||
035 | |a (OCoLC)1334027548 | ||
035 | |a (DE-599)BVBBV047471764 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
044 | |a xxu |c US | ||
049 | |a DE-11 |a DE-473 |a DE-1050 |a DE-M49 | ||
082 | 0 | |a 519.502855133 | |
084 | |a CM 3000 |0 (DE-625)18945: |2 rvk | ||
084 | |a ST 250 |0 (DE-625)143626: |2 rvk | ||
084 | |a ST 601 |0 (DE-625)143682: |2 rvk | ||
084 | |a WC 7000 |0 (DE-625)148142: |2 rvk | ||
084 | |a MR 2200 |0 (DE-625)123489: |2 rvk | ||
084 | |a DAT 307 |2 stub | ||
084 | |a DAT 754 |2 stub | ||
084 | |a MAT 620 |2 stub | ||
100 | 1 | |a Kabacoff, Robert |e Verfasser |0 (DE-588)14294372X |4 aut | |
245 | 1 | 0 | |a R in action |b data analysis and graphics with R and tidyverse |c Robert I. Kabacoff |
250 | |a Third edition | ||
264 | 1 | |a Shelter Island |b Manning |c [2022] | |
264 | 4 | |c © 2022 | |
300 | |a xxxi, 622 Seiten |b Illustrationen, Diagramme |c 24 cm | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
650 | 0 | 7 | |a Statistik |0 (DE-588)4056995-0 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Datenanalyse |0 (DE-588)4123037-1 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a R |g Programm |0 (DE-588)4705956-4 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a R |g Programm |0 (DE-588)4705956-4 |D s |
689 | 0 | 1 | |a Statistik |0 (DE-588)4056995-0 |D s |
689 | 0 | 2 | |a Datenanalyse |0 (DE-588)4123037-1 |D s |
689 | 0 | |5 DE-604 | |
856 | 4 | 2 | |m Digitalisierung UB Bamberg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032873412&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-032873412 |
Record in the search index
_version_ | 1804182783951634432 |
---|---|
adam_text | Contents
preface
acknowledgments
about this book
about the author
about the cover illustration
Part 1 Getting started
1 Introduction to R
1.1 Why use R?
1.2 Obtaining and installing R
1.3 Working with R
    Getting started ■ Using RStudio ■ Getting help ■ The workspace ■ Projects
1.4 Packages
    What are packages? ■ Installing a package ■ Loading a package ■ Learning about a package
1.5 Using output as input: Reusing results
1.6 Working with large datasets
1.7 Working through an example
2 Creating a dataset
2.1 Understanding datasets
2.2 Data structures
    Vectors ■ Matrices ■ Arrays ■ Data frames ■ Factors ■ Lists ■ Tibbles
2.3 Data input
    Entering data from the keyboard ■ Importing data from a delimited text file ■ Importing data from Excel ■ Importing data from JSON ■ Importing data from the web ■ Importing data from SPSS ■ Importing data from SAS ■ Importing data from Stata ■ Accessing database management systems ■ Importing data via Stat/Transfer
2.4 Annotating datasets
    Variable labels ■ Value labels
2.5 Useful functions for working with data objects
3 Basic data management
3.1 A working example
3.2 Creating new variables
3.3 Recoding variables
3.4 Renaming variables
3.5 Missing values
    Recoding values to missing ■ Excluding missing values from analyses
3.6 Date values
    Converting dates to character variables ■ Going further
3.7 Type conversions
3.8 Sorting data
3.9 Merging datasets
    Adding columns to a data frame ■ Adding rows to a data frame
3.10 Subsetting datasets
    Selecting variables ■ Dropping variables ■ Selecting observations ■ The subset() function ■ Random samples
3.11 Using dplyr to manipulate data frames
    Basic dplyr functions ■ Using pipe operators to chain statements
3.12 Using SQL statements to manipulate data frames
4 Getting started with graphs
4.1 Creating a graph with ggplot2
    ggplot ■ Geoms ■ Grouping ■ Scales ■ Facets ■ Labels ■ Themes
4.2 ggplot2 details
    Placing the data and mapping options ■ Graphs as objects ■ Saving graphs ■ Common mistakes
5 Advanced data management
5.1 A data management challenge
5.2 Numerical and character functions
    Mathematical functions ■ Statistical functions ■ Probability functions ■ Character functions ■ Other useful functions ■ Applying functions to matrices and data frames ■ A solution for the data management challenge
5.3 Control flow
    Repetition and looping ■ Conditional execution
5.4 User-written functions
5.5 Reshaping data
    Transposing ■ Converting from wide to long dataset formats
5.6 Aggregating data
Part 2 Basic methods
6 Basic graphs
6.1 Bar charts
    Simple bar charts ■ Stacked, grouped, and filled bar charts ■ Mean bar charts ■ Tweaking bar charts
6.2 Pie charts
6.3 Tree maps
6.4 Histograms
6.5 Kernel density plots
6.6 Box plots
    Using parallel box plots to compare groups ■ Violin plots
6.7 Dot plots
7 Basic statistics
7.1 Descriptive statistics
    A menagerie of methods ■ Even more methods ■ Descriptive statistics by group ■ Summarizing data interactively with dplyr ■ Visualizing results
7.2 Frequency and contingency tables
    Generating frequency tables ■ Tests of independence ■ Measures of association ■ Visualizing results
7.3 Correlations
    Types of correlations ■ Testing correlations for significance ■ Visualizing correlations
7.4 T-tests
    Independent t-test ■ Dependent t-test ■ When there are more than two groups
7.5 Nonparametric tests of group differences
    Comparing two groups ■ Comparing more than two groups
7.6 Visualizing group differences
Part 3 Intermediate methods
8 Regression
8.1 The many faces of regression
    Scenarios for using OLS regression ■ What you need to know
8.2 OLS regression
    Fitting regression models with lm() ■ Simple linear regression ■ Polynomial regression ■ Multiple linear regression ■ Multiple linear regression with interactions
8.3 Regression diagnostics
    A typical approach ■ An enhanced approach ■ Multicollinearity
8.4 Unusual observations
    Outliers ■ High-leverage points ■ Influential observations
8.5 Corrective measures
    Deleting observations ■ Transforming variables ■ Adding or deleting variables ■ Trying a different approach
8.6 Selecting the “best” regression model
    Comparing models ■ Variable selection
8.7 Taking the analysis further
    Cross-validation ■ Relative importance
9 Analysis of variance
9.1 A crash course on terminology
9.2 Fitting ANOVA models
    The aov() function ■ The order of formula terms
9.3 One-way ANOVA
    Multiple comparisons ■ Assessing test assumptions
9.4 One-way ANCOVA
    Assessing test assumptions ■ Visualizing the results
9.5 Two-way factorial ANOVA
9.6 Repeated measures ANOVA
9.7 Multivariate analysis of variance (MANOVA)
    Assessing test assumptions ■ Robust MANOVA
9.8 ANOVA as regression
10 Power analysis
10.1 A quick review of hypothesis testing
10.2 Implementing power analysis with the pwr package
    T-tests ■ ANOVA ■ Correlations ■ Linear models ■ Tests of proportions ■ Chi-square tests ■ Choosing an appropriate effect size in novel situations
10.3 Creating power analysis plots
10.4 Other packages
11 Intermediate graphs
11.1 Scatter plots
    Scatter plot matrices ■ High-density scatter plots ■ 3D scatter plots ■ Spinning 3D scatter plots ■ Bubble plots
11.2 Line charts
11.3 Corrgrams
11.4 Mosaic plots
12 Resampling statistics and bootstrapping
12.1 Permutation tests
12.2 Permutation tests with the coin package
    Independent two-sample and k-sample tests ■ Independence in contingency tables ■ Independence between numeric variables ■ Dependent two-sample and k-sample tests ■ Going further
12.3 Permutation tests with the lmPerm package
    Simple and polynomial regression ■ Multiple regression ■ One-way ANOVA and ANCOVA ■ Two-way ANOVA
12.4 Additional comments on permutation tests
12.5 Bootstrapping
12.6 Bootstrapping with the boot package
    Bootstrapping a single statistic ■ Bootstrapping several statistics
Part 4 Advanced methods
13 Generalized linear models
13.1 Generalized linear models and the glm() function
    The glm() function ■ Supporting functions ■ Model fit and regression diagnostics
13.2 Logistic regression
    Interpreting the model parameters ■ Assessing the impact of predictors on the probability of an outcome ■ Overdispersion ■ Extensions
13.3 Poisson regression
    Interpreting the model parameters ■ Overdispersion ■ Extensions
14 Principal components and factor analysis
14.1 Principal components and factor analysis in R
14.2 Principal components
    Selecting the number of components to extract ■ Extracting principal components ■ Rotating principal components ■ Obtaining principal component scores
14.3 Exploratory factor analysis
    Deciding how many common factors to extract ■ Extracting common factors ■ Rotating factors ■ Factor scores ■ Other EFA-related packages
14.4 Other latent variable models
15 Time series
15.1 Creating a time-series object in R
15.2 Smoothing and seasonal decomposition
    Smoothing with simple moving averages ■ Seasonal decomposition
15.3 Exponential forecasting models
    Simple exponential smoothing ■ Holt and Holt-Winters exponential smoothing ■ The ets() function and automated forecasting
15.4 ARIMA forecasting models
    Prerequisite concepts ■ ARMA and ARIMA models ■ Automated ARIMA forecasting
15.5 Going further
16 Cluster analysis
16.1 Common steps in cluster analysis
16.2 Calculating distances
16.3 Hierarchical cluster analysis
16.4 Partitioning-cluster analysis
    K-means clustering ■ Partitioning around medoids
16.5 Avoiding nonexistent clusters
16.6 Going further
17 Classification
17.1 Preparing the data
17.2 Logistic regression
17.3 Decision trees
    Classical decision trees ■ Conditional inference trees
17.4 Random forests
17.5 Support vector machines
    Tuning an SVM
17.6 Choosing a best predictive solution
17.7 Understanding black box predictions
    Break-down plots ■ Plotting Shapley values
17.8 Going further
18 Advanced methods for missing data
18.1 Steps in dealing with missing data
18.2 Identifying missing values
18.3 Exploring missing-values patterns
    Visualizing missing values ■ Using correlations to explore missing values
18.4 Understanding the sources and impact of missing data
18.5 Rational approaches for dealing with incomplete data
18.6 Deleting missing data
    Complete-case analysis (listwise deletion) ■ Available case analysis (pairwise deletion)
18.7 Single imputation
    Simple imputation ■ K-nearest neighbor imputation ■ missForest
18.8 Multiple imputation
18.9 Other approaches to missing data
Part 5 Expanding your skills
19 Advanced graphs
19.1 Modifying scales
    Customizing axes ■ Customizing colors
19.2 Modifying themes
    Prepackaged themes ■ Customizing fonts ■ Customizing legends ■ Customizing the plot area
19.3 Adding annotations
19.4 Combining graphs
19.5 Making graphs interactive
20 Advanced programming
20.1 A review of the language
    Data types ■ Control structures ■ Creating functions
20.2 Working with environments
20.3 Non-standard evaluation
20.4 Object-oriented programming
    Generic functions ■ Limitations of the S3 model
20.5 Writing efficient code
    Efficient data input ■ Vectorization ■ Correctly sizing objects ■ Parallelization
20.6 Debugging
    Common sources of errors ■ Debugging tools ■ Session options that support debugging ■ Using RStudio’s visual debugger
20.7 Going further
21 Creating dynamic reports
21.1 A template approach to reports
21.2 Creating a report with R and R Markdown
21.3 Creating a report with R and LaTeX
21.4 Creating a parameterized report
21.5 Avoiding common R Markdown problems
21.6 Going further
22 Creating a package
22.1 The edatools package
22.2 Creating a package
    Installing development tools ■ Creating a package project ■ Writing the package functions ■ Adding function documentation ■ Adding a general help file (optional) ■ Adding sample data to the package (optional) ■ Adding a vignette (optional) ■ Editing the DESCRIPTION file ■ Building and installing the package
22.3 Sharing your package
    Distributing a source package file ■ Submitting to CRAN ■ Hosting on GitHub ■ Creating a package website
22.4 Going further
afterword Into the rabbit hole
appendix A Graphical user interfaces
appendix B Customizing the startup environment
appendix C Exporting data from R
appendix D Matrix algebra in R
appendix E Packages used in this book
appendix F Working with large datasets
appendix G Updating an R installation
references
index
|
any_adam_object | 1 |
any_adam_object_boolean | 1 |
author | Kabacoff, Robert |
author_GND | (DE-588)14294372X |
author_facet | Kabacoff, Robert |
author_role | aut |
author_sort | Kabacoff, Robert |
author_variant | r k rk |
building | Verbundindex |
bvnumber | BV047471764 |
classification_rvk | CM 3000 ST 250 ST 601 WC 7000 MR 2200 |
classification_tum | DAT 307 DAT 754 MAT 620 |
ctrlnum | (OCoLC)1334027548 (DE-599)BVBBV047471764 |
dewey-full | 519.502855133 |
dewey-hundreds | 500 - Natural sciences and mathematics |
dewey-ones | 519 - Probabilities and applied mathematics |
dewey-raw | 519.502855133 |
dewey-search | 519.502855133 |
dewey-sort | 3519.502855133 |
dewey-tens | 510 - Mathematics |
discipline | Biologie Informatik Soziologie Psychologie Mathematik |
discipline_str_mv | Biologie Informatik Soziologie Psychologie Mathematik |
edition | Third edition |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01885nam a2200493 c 4500</leader><controlfield tag="001">BV047471764</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20230310 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">210916s2022 xxua||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781617296055</subfield><subfield code="q">paperback</subfield><subfield code="9">978-1-61729-605-5</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1334027548</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV047471764</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="044" ind1=" " ind2=" "><subfield code="a">xxu</subfield><subfield code="c">US</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-11</subfield><subfield code="a">DE-473</subfield><subfield code="a">DE-1050</subfield><subfield code="a">DE-M49</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">519.502855133</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">CM 3000</subfield><subfield code="0">(DE-625)18945:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 250</subfield><subfield code="0">(DE-625)143626:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 601</subfield><subfield code="0">(DE-625)143682:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">WC 
7000</subfield><subfield code="0">(DE-625)148142:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">MR 2200</subfield><subfield code="0">(DE-625)123489:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">DAT 307</subfield><subfield code="2">stub</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">DAT 754</subfield><subfield code="2">stub</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">MAT 620</subfield><subfield code="2">stub</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Kabacoff, Robert</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)14294372X</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">R in action</subfield><subfield code="b">data analysis and graphics with R and tidyverse</subfield><subfield code="c">Robert I. 
Kabacoff</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">Third edition</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Shelter Island</subfield><subfield code="b">Manning</subfield><subfield code="c">[2022]</subfield></datafield><datafield tag="264" ind1=" " ind2="4"><subfield code="c">© 2022</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">xxxi, 622 Seiten</subfield><subfield code="b">Illustrationen, Diagramme</subfield><subfield code="c">24 cm</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Statistik</subfield><subfield code="0">(DE-588)4056995-0</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Datenanalyse</subfield><subfield code="0">(DE-588)4123037-1</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">R</subfield><subfield code="g">Programm</subfield><subfield code="0">(DE-588)4705956-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">R</subfield><subfield code="g">Programm</subfield><subfield code="0">(DE-588)4705956-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Statistik</subfield><subfield code="0">(DE-588)4056995-0</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" 
ind2="2"><subfield code="a">Datenanalyse</subfield><subfield code="0">(DE-588)4123037-1</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Bamberg - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032873412&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-032873412</subfield></datafield></record></collection> |
id | DE-604.BV047471764 |
illustrated | Illustrated |
index_date | 2024-07-03T18:09:36Z |
indexdate | 2024-07-10T09:13:03Z |
institution | BVB |
isbn | 9781617296055 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-032873412 |
oclc_num | 1334027548 |
open_access_boolean | |
owner | DE-11 DE-473 DE-BY-UBG DE-1050 DE-M49 DE-BY-TUM |
owner_facet | DE-11 DE-473 DE-BY-UBG DE-1050 DE-M49 DE-BY-TUM |
physical | xxxi, 622 Seiten Illustrationen, Diagramme 24 cm |
publishDate | 2022 |
publishDateSearch | 2022 |
publishDateSort | 2022 |
publisher | Manning |
record_format | marc |
spelling | Kabacoff, Robert Verfasser (DE-588)14294372X aut R in action data analysis and graphics with R and tidyverse Robert I. Kabacoff Third edition Shelter Island Manning [2022] © 2022 xxxi, 622 Seiten Illustrationen, Diagramme 24 cm txt rdacontent n rdamedia nc rdacarrier Statistik (DE-588)4056995-0 gnd rswk-swf Datenanalyse (DE-588)4123037-1 gnd rswk-swf R Programm (DE-588)4705956-4 gnd rswk-swf R Programm (DE-588)4705956-4 s Statistik (DE-588)4056995-0 s Datenanalyse (DE-588)4123037-1 s DE-604 Digitalisierung UB Bamberg - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032873412&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Kabacoff, Robert R in action data analysis and graphics with R and tidyverse Statistik (DE-588)4056995-0 gnd Datenanalyse (DE-588)4123037-1 gnd R Programm (DE-588)4705956-4 gnd |
subject_GND | (DE-588)4056995-0 (DE-588)4123037-1 (DE-588)4705956-4 |
title | R in action data analysis and graphics with R and tidyverse |
title_auth | R in action data analysis and graphics with R and tidyverse |
title_exact_search | R in action data analysis and graphics with R and tidyverse |
title_exact_search_txtP | R in action data analysis and graphics with R and tidyverse |
title_full | R in action data analysis and graphics with R and tidyverse Robert I. Kabacoff |
title_fullStr | R in action data analysis and graphics with R and tidyverse Robert I. Kabacoff |
title_full_unstemmed | R in action data analysis and graphics with R and tidyverse Robert I. Kabacoff |
title_short | R in action |
title_sort | r in action data analysis and graphics with r and tidyverse |
title_sub | data analysis and graphics with R and tidyverse |
topic | Statistik (DE-588)4056995-0 gnd Datenanalyse (DE-588)4123037-1 gnd R Programm (DE-588)4705956-4 gnd |
topic_facet | Statistik Datenanalyse R Programm |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032873412&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT kabacoffrobert rinactiondataanalysisandgraphicswithrandtidyverse |