An introduction to statistical learning: with applications in Python
Saved in:
Main authors: | James, Gareth; Witten, Daniela; Hastie, Trevor; Tibshirani, Robert; Taylor, Jonathan E. |
---|---|
Format: | Book |
Language: | English |
Published: | Cham : Springer, [2023] |
Series: | Springer texts in statistics |
Subjects: | Python; Machine learning; Data analysis; Statistics |
Online access: | Full text; Table of contents |
Summary: | An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance, marketing, and astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, deep learning, survival analysis, multiple testing, and more. Color graphics and real-world examples are used to illustrate the methods presented. This book is targeted at statisticians and non-statisticians alike, who wish to use cutting-edge statistical learning techniques to analyze their data. |
Physical description: | xv, 607 pages : illustrations, diagrams |
ISBN: | 9783031387463; 9783031391897 |
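The labs accompanying each chapter are written in Python. As a rough flavor of the workflow they cover (a minimal sketch, not code from the book: the synthetic data and the scikit-learn calls below are illustrative assumptions), fitting and evaluating a linear regression might look like this:

```python
# Minimal sketch of a statistical-learning workflow in Python:
# fit a linear regression on a training split and report test error.
# The data below is synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                         # three predictors
beta = np.array([1.5, -0.5, 0.0])                     # true coefficients
y = 2.0 + X @ beta + rng.normal(scale=0.5, size=200)  # linear signal + noise

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LinearRegression().fit(X_train, y_train)

print("estimated coefficients:", model.coef_)
print("test MSE:", mean_squared_error(y_test, model.predict(X_test)))
```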
Internal format
MARC
LEADER 00000nam a2200000 c 4500
001 BV049047206
003 DE-604
005 20241128
007 t|
008 230712s2023 xx a||| |||| 00||| eng d
020 __ |a 9783031387463 |q hbk |9 978-3-031-38746-3
020 __ |a 9783031391897 |q pbk : ca. EUR 79.95 |9 978-3-031-39189-7
035 __ |a (OCoLC)1390747667
035 __ |a (DE-599)BVBBV049047206
040 __ |a DE-604 |b ger |e rda
041 0_ |a eng
049 __ |a DE-473 |a DE-11 |a DE-860 |a DE-384 |a DE-355 |a DE-863 |a DE-188 |a DE-29T |a DE-83 |a DE-521 |a DE-573 |a DE-739 |a DE-703
082 0_ |a 519.5 |2 23
084 __ |a ST 250 |0 (DE-625)143626: |2 rvk
084 __ |a SK 830 |0 (DE-625)143259: |2 rvk
084 __ |a XF 3400 |0 (DE-625)152765: |2 rvk
084 __ |a SK 840 |0 (DE-625)143261: |2 rvk
084 __ |a 62-04 |2 msc/2020
084 __ |a 62H30 |2 msc/2020
084 __ |a 68T05 |2 msc/2020
100 1_ |a James, Gareth |e Verfasser |0 (DE-588)1038457327 |4 aut
245 10 |a An introduction to statistical learning |b with applications in Python |c Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, Jonathan Taylor
264 _1 |a Cham |b Springer |c [2023]
264 _4 |c © 2023
300 __ |a xv, 607 Seiten |b Illustrationen, Diagramme
336 __ |b txt |2 rdacontent
337 __ |b n |2 rdamedia
338 __ |b nc |2 rdacarrier
490 0_ |a Springer texts in statistics
520 3_ |a An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance, marketing, and astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, deep learning, survival analysis, multiple testing, and more. Color graphics and real-world examples are used to illustrate the methods presented. This book is targeted at statisticians and non-statisticians alike, who wish to use cutting-edge statistical learning techniques to analyze their data.
600 07 |a Python |0 (DE-588)118793772 |2 gnd |9 rswk-swf
650 _4 |a Statistical Theory and Methods
650 _4 |a Statistics and Computing
650 _4 |a Applied Statistics
650 _4 |a Statistics
650 07 |a Datenanalyse |0 (DE-588)4123037-1 |2 gnd |9 rswk-swf
650 07 |a Statistik |0 (DE-588)4056995-0 |2 gnd |9 rswk-swf
650 07 |a Maschinelles Lernen |0 (DE-588)4193754-5 |2 gnd |9 rswk-swf
689 00 |a Python |0 (DE-588)118793772 |D p
689 01 |a Maschinelles Lernen |0 (DE-588)4193754-5 |D s
689 02 |a Datenanalyse |0 (DE-588)4123037-1 |D s
689 03 |a Statistik |0 (DE-588)4056995-0 |D s
689 0_ |5 DE-604
700 1_ |a Witten, Daniela |e Verfasser |0 (DE-588)108120849X |4 aut
700 1_ |a Hastie, Trevor |d 1953- |e Verfasser |0 (DE-588)172128242 |4 aut
700 1_ |a Tibshirani, Robert |d 1956- |e Verfasser |0 (DE-588)172417740 |4 aut
700 1_ |a Taylor, Jonathan E. |e Verfasser |0 (DE-588)102963100X |4 aut
776 08 |i Erscheint auch als |n Online-Ausgabe |z 978-3-031-38747-0 |w (DE-604)BV049032803
856 41 |u https://www.statlearning.com/ |x Verlag |z kostenfrei |3 Volltext
856 42 |m Digitalisierung UB Bamberg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=034309627&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis
912 __ |a ebook
940 1_ |q gbd_0
943 1_ |a oai:aleph.bib-bvb.de:BVB01-034309627
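A MARC record like the one above can also be processed programmatically. A minimal sketch, assuming the pymarc library is installed and that the record has been exported to a hypothetical local file record.mrc in binary MARC:

```python
# Sketch: parse a binary MARC export with pymarc and print a few
# fields of interest. "record.mrc" is a hypothetical local file name.
from pymarc import MARCReader

with open("record.mrc", "rb") as fh:
    for record in MARCReader(fh):
        title = record["245"]                      # title statement
        print(" ".join(title.get_subfields("a", "b")))
        for field in record.get_fields("020"):     # ISBNs
            for isbn in field.get_subfields("a"):
                print("ISBN:", isbn)
        for field in record.get_fields("856"):     # online access links
            for url in field.get_subfields("u"):
                print("URL:", url)
```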
Record in the search index
DE-BY-863_location | 1000 |
---|---|
DE-BY-FWS_call_number | 1000/ST 250 J27 |
DE-BY-FWS_katkey | 1069087 |
DE-BY-FWS_media_number | 083101208092 |
_version_ | 1819742520647614464 |
adam_text |

Contents

Preface  vii

1 Introduction  1

2 Statistical Learning  15
  2.1 What Is Statistical Learning?  15
    2.1.1 Why Estimate f?  17
    2.1.2 How Do We Estimate f?  20
    2.1.3 The Trade-Off Between Prediction Accuracy and Model Interpretability  23
    2.1.4 Supervised Versus Unsupervised Learning  25
    2.1.5 Regression Versus Classification Problems  27
  2.2 Assessing Model Accuracy  27
    2.2.1 Measuring the Quality of Fit  28
    2.2.2 The Bias-Variance Trade-Off  31
    2.2.3 The Classification Setting  34
  2.3 Lab: Introduction to Python  40
    2.3.1 Getting Started  40
    2.3.2 Basic Commands  40
    2.3.3 Introduction to Numerical Python  42
    2.3.4 Graphics  48
    2.3.5 Sequences and Slice Notation  51
    2.3.6 Indexing Data  51
    2.3.7 Loading Data  55
    2.3.8 For Loops  59
    2.3.9 Additional Graphical and Numerical Summaries  61
  2.4 Exercises  63

3 Linear Regression  69
  3.1 Simple Linear Regression  70
    3.1.1 Estimating the Coefficients  71
    3.1.2 Assessing the Accuracy of the Coefficient Estimates  72
    3.1.3 Assessing the Accuracy of the Model  77
  3.2 Multiple Linear Regression  80
    3.2.1 Estimating the Regression Coefficients  81
    3.2.2 Some Important Questions  83
  3.3 Other Considerations in the Regression Model  91
    3.3.1 Qualitative Predictors  91
    3.3.2 Extensions of the Linear Model  94
    3.3.3 Potential Problems  100
  3.4 The Marketing Plan  109
  3.5 Comparison of Linear Regression with K-Nearest Neighbors  111
  3.6 Lab: Linear Regression  116
    3.6.1 Importing packages  116
    3.6.2 Simple Linear Regression  117
    3.6.3 Multiple Linear Regression  122
    3.6.4 Multivariate Goodness of Fit  123
    3.6.5 Interaction Terms  124
    3.6.6 Non-linear Transformations of the Predictors  125
    3.6.7 Qualitative Predictors  126
  3.7 Exercises  127

4 Classification  135
  4.1 An Overview of Classification  135
  4.2 Why Not Linear Regression?  136
  4.3 Logistic Regression  138
    4.3.1 The Logistic Model  139
    4.3.2 Estimating the Regression Coefficients  140
    4.3.3 Making Predictions  141
    4.3.4 Multiple Logistic Regression  142
    4.3.5 Multinomial Logistic Regression  144
  4.4 Generative Models for Classification  146
    4.4.1 Linear Discriminant Analysis for p = 1  147
    4.4.2 Linear Discriminant Analysis for p > 1  150
    4.4.3 Quadratic Discriminant Analysis  156
    4.4.4 Naive Bayes  158
  4.5 A Comparison of Classification Methods  161
    4.5.1 An Analytical Comparison  161
    4.5.2 An Empirical Comparison  164
  4.6 Generalized Linear Models  167
    4.6.1 Linear Regression on the Bikeshare Data  167
    4.6.2 Poisson Regression on the Bikeshare Data  169
    4.6.3 Generalized Linear Models in Greater Generality  172
  4.7 Lab: Logistic Regression, LDA, QDA, and KNN  173
    4.7.1 The Stock Market Data  173
    4.7.2 Logistic Regression  174
    4.7.3 Linear Discriminant Analysis  179
    4.7.4 Quadratic Discriminant Analysis  181
    4.7.5 Naive Bayes  182
    4.7.6 K-Nearest Neighbors  183
    4.7.7 Linear and Poisson Regression on the Bikeshare Data  188
  4.8 Exercises  193

5 Resampling Methods  201
  5.1 Cross-Validation  202
    5.1.1 The Validation Set Approach  202
    5.1.2 Leave-One-Out Cross-Validation  204
    5.1.3 k-Fold Cross-Validation  206
    5.1.4 Bias-Variance Trade-Off for k-Fold Cross-Validation  208
    5.1.5 Cross-Validation on Classification Problems  209
  5.2 The Bootstrap  212
  5.3 Lab: Cross-Validation and the Bootstrap  215
    5.3.1 The Validation Set Approach  216
    5.3.2 Cross-Validation  217
    5.3.3 The Bootstrap  220
  5.4 Exercises  224

6 Linear Model Selection and Regularization  229
  6.1 Subset Selection  231
    6.1.1 Best Subset Selection  231
    6.1.2 Stepwise Selection  233
    6.1.3 Choosing the Optimal Model  235
  6.2 Shrinkage Methods  240
    6.2.1 Ridge Regression  240
    6.2.2 The Lasso  244
    6.2.3 Selecting the Tuning Parameter  252
  6.3 Dimension Reduction Methods  253
    6.3.1 Principal Components Regression  254
    6.3.2 Partial Least Squares  260
  6.4 Considerations in High Dimensions  262
    6.4.1 High-Dimensional Data  262
    6.4.2 What Goes Wrong in High Dimensions?  263
    6.4.3 Regression in High Dimensions  265
    6.4.4 Interpreting Results in High Dimensions  266
  6.5 Lab: Linear Models and Regularization Methods  267
    6.5.1 Subset Selection Methods  268
    6.5.2 Ridge Regression and the Lasso  273
    6.5.3 PCR and PLS Regression  280
  6.6 Exercises  283

7 Moving Beyond Linearity  289
  7.1 Polynomial Regression  290
  7.2 Step Functions  292
  7.3 Basis Functions  293
  7.4 Regression Splines  294
    7.4.1 Piecewise Polynomials  294
    7.4.2 Constraints and Splines  296
    7.4.3 The Spline Basis Representation  296
    7.4.4 Choosing the Number and Locations of the Knots  297
    7.4.5 Comparison to Polynomial Regression  299
  7.5 Smoothing Splines  300
    7.5.1 An Overview of Smoothing Splines  300
    7.5.2 Choosing the Smoothing Parameter λ  301
  7.6 Local Regression  303
  7.7 Generalized Additive Models  305
    7.7.1 GAMs for Regression Problems  306
    7.7.2 GAMs for Classification Problems  308
  7.8 Lab: Non-Linear Modeling  309
    7.8.1 Polynomial Regression and Step Functions  310
    7.8.2 Splines  315
    7.8.3 Smoothing Splines and GAMs  317
    7.8.4 Local Regression  324
  7.9 Exercises  325

8 Tree-Based Methods  331
  8.1 The Basics of Decision Trees  331
    8.1.1 Regression Trees  331
    8.1.2 Classification Trees  337
    8.1.3 Trees Versus Linear Models  341
    8.1.4 Advantages and Disadvantages of Trees  341
  8.2 Bagging, Random Forests, Boosting, and Bayesian Additive Regression Trees  343
    8.2.1 Bagging  343
    8.2.2 Random Forests  346
    8.2.3 Boosting  347
    8.2.4 Bayesian Additive Regression Trees  350
    8.2.5 Summary of Tree Ensemble Methods  353
  8.3 Lab: Tree-Based Methods  354
    8.3.1 Fitting Classification Trees  355
    8.3.2 Fitting Regression Trees  358
    8.3.3 Bagging and Random Forests  360
    8.3.4 Boosting  361
    8.3.5 Bayesian Additive Regression Trees  362
  8.4 Exercises  363

9 Support Vector Machines  367
  9.1 Maximal Margin Classifier  367
    9.1.1 What Is a Hyperplane?  368
    9.1.2 Classification Using a Separating Hyperplane  368
    9.1.3 The Maximal Margin Classifier  370
    9.1.4 Construction of the Maximal Margin Classifier  372
    9.1.5 The Non-separable Case  372
  9.2 Support Vector Classifiers  373
    9.2.1 Overview of the Support Vector Classifier  373
    9.2.2 Details of the Support Vector Classifier  374
  9.3 Support Vector Machines  377
    9.3.1 Classification with Non-Linear Decision Boundaries  378
    9.3.2 The Support Vector Machine  379
    9.3.3 An Application to the Heart Disease Data  382
  9.4 SVMs with More than Two Classes  383
    9.4.1 One-Versus-One Classification  384
    9.4.2 One-Versus-All Classification  384
  9.5 Relationship to Logistic Regression  384
  9.6 Lab: Support Vector Machines  387
    9.6.1 Support Vector Classifier  387
    9.6.2 Support Vector Machine  390
    9.6.3 ROC Curves  392
    9.6.4 SVM with Multiple Classes  393
    9.6.5 Application to Gene Expression Data  394
  9.7 Exercises  395

10 Deep Learning  399
  10.1 Single Layer Neural Networks  400
  10.2 Multilayer Neural Networks  402
  10.3 Convolutional Neural Networks  406
    10.3.1 Convolution Layers  407
    10.3.2 Pooling Layers  410
    10.3.3 Architecture of a Convolutional Neural Network  410
    10.3.4 Data Augmentation  411
    10.3.5 Results Using a Pretrained Classifier  412
  10.4 Document Classification  413
  10.5 Recurrent Neural Networks  416
    10.5.1 Sequential Models for Document Classification  418
    10.5.2 Time Series Forecasting  420
    10.5.3 Summary of RNNs  424
  10.6 When to Use Deep Learning  425
  10.7 Fitting a Neural Network  427
    10.7.1 Backpropagation  428
    10.7.2 Regularization and Stochastic Gradient Descent  429
    10.7.3 Dropout Learning  431
    10.7.4 Network Tuning  431
  10.8 Interpolation and Double Descent  432
  10.9 Lab: Deep Learning  435
    10.9.1 Single Layer Network on Hitters Data  437
    10.9.2 Multilayer Network on the MNIST Digit Data  444
    10.9.3 Convolutional Neural Networks  448
    10.9.4 Using Pretrained CNN Models  452
    10.9.5 IMDB Document Classification  454
    10.9.6 Recurrent Neural Networks  458
  10.10 Exercises  465

11 Survival Analysis and Censored Data  469
  11.1 Survival and Censoring Times  470
  11.2 A Closer Look at Censoring  470
  11.3 The Kaplan-Meier Survival Curve  472
  11.4 The Log-Rank Test  474
  11.5 Regression Models With a Survival Response  476
    11.5.1 The Hazard Function  476
    11.5.2 Proportional Hazards  478
    11.5.3 Example: Brain Cancer Data  482
    11.5.4 Example: Publication Data  482
  11.6 Shrinkage for the Cox Model  484
  11.7 Additional Topics  486
    11.7.1 Area Under the Curve for Survival Analysis  486
    11.7.2 Choice of Time Scale  487
    11.7.3 Time-Dependent Covariates  488
    11.7.4 Checking the Proportional Hazards Assumption  488
    11.7.5 Survival Trees  488
  11.8 Lab: Survival Analysis  489
    11.8.1 Brain Cancer Data  489
    11.8.2 Publication Data  493
    11.8.3 Call Center Data  494
  11.9 Exercises  498

12 Unsupervised Learning  503
  12.1 The Challenge of Unsupervised Learning  503
  12.2 Principal Components Analysis  504
    12.2.1 What Are Principal Components?  505
    12.2.2 Another Interpretation of Principal Components  508
    12.2.3 The Proportion of Variance Explained  510
    12.2.4 More on PCA  512
    12.2.5 Other Uses for Principal Components  515
  12.3 Missing Values and Matrix Completion  515
  12.4 Clustering Methods  520
    12.4.1 K-Means Clustering  521
    12.4.2 Hierarchical Clustering  525
    12.4.3 Practical Issues in Clustering  532
  12.5 Lab: Unsupervised Learning  535
    12.5.1 Principal Components Analysis  535
    12.5.2 Matrix Completion  539
    12.5.3 Clustering  542
    12.5.4 NCI60 Data Example  546
  12.6 Exercises  552

13 Multiple Testing  557
  13.1 A Quick Review of Hypothesis Testing  558
    13.1.1 Testing a Hypothesis  558
    13.1.2 Type I and Type II Errors  562
  13.2 The Challenge of Multiple Testing  563
  13.3 The Family-Wise Error Rate  565
    13.3.1 What is the Family-Wise Error Rate?  565
    13.3.2 Approaches to Control the Family-Wise Error Rate  567
    13.3.3 Trade-Off Between the FWER and Power  572
  13.4 The False Discovery Rate  573
    13.4.1 Intuition for the False Discovery Rate  573
    13.4.2 The Benjamini-Hochberg Procedure  575
  13.5 A Re-Sampling Approach to p-Values and False Discovery Rates  577
    13.5.1 A Re-Sampling Approach to the p-Value  578
    13.5.2 A Re-Sampling Approach to the False Discovery Rate  579
    13.5.3 When Are Re-Sampling Approaches Useful?  581
  13.6 Lab: Multiple Testing  583
    13.6.1 Review of Hypothesis Tests  583
    13.6.2 Family-Wise Error Rate  585
    13.6.3 False Discovery Rate  588
    13.6.4 A Re-Sampling Approach  590
  13.7 Exercises  593

Index  597
any_adam_object | 1 |
any_adam_object_boolean | 1 |
author | James, Gareth Witten, Daniela Hastie, Trevor 1953- Tibshirani, Robert 1956- Taylor, Jonathan E. |
author_GND | (DE-588)1038457327 (DE-588)108120849X (DE-588)172128242 (DE-588)172417740 (DE-588)102963100X |
author_facet | James, Gareth Witten, Daniela Hastie, Trevor 1953- Tibshirani, Robert 1956- Taylor, Jonathan E. |
author_role | aut aut aut aut aut |
author_sort | James, Gareth |
author_variant | g j gj d w dw t h th r t rt j e t je jet |
building | Verbundindex |
bvnumber | BV049047206 |
classification_rvk | ST 250 SK 830 XF 3400 SK 840 |
collection | ebook |
ctrlnum | (OCoLC)1390747667 (DE-599)BVBBV049047206 |
dewey-full | 519.5 |
dewey-hundreds | 500 - Natural sciences and mathematics |
dewey-ones | 519 - Probabilities and applied mathematics |
dewey-raw | 519.5 |
dewey-search | 519.5 |
dewey-sort | 3519.5 |
dewey-tens | 510 - Mathematics |
discipline | Informatik Mathematik Medizin |
discipline_str_mv | Informatik Mathematik |
format | Book |
fullrecord | (MARCXML serialization of the MARC record shown above) |
id | DE-604.BV049047206 |
illustrated | Illustrated |
index_date | 2024-07-03T22:20:35Z |
indexdate | 2024-12-29T04:08:24Z |
institution | BVB |
isbn | 9783031387463 9783031391897 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-034309627 |
oclc_num | 1390747667 |
open_access_boolean | 1 |
owner | DE-473 DE-BY-UBG DE-11 DE-860 DE-384 DE-355 DE-BY-UBR DE-863 DE-BY-FWS DE-188 DE-29T DE-83 DE-521 DE-573 DE-739 DE-703 |
owner_facet | DE-473 DE-BY-UBG DE-11 DE-860 DE-384 DE-355 DE-BY-UBR DE-863 DE-BY-FWS DE-188 DE-29T DE-83 DE-521 DE-573 DE-739 DE-703 |
physical | xv, 607 Seiten Illustrationen, Diagramme |
psigel | ebook gbd_0 |
publishDate | 2023 |
publishDateSearch | 2023 |
publishDateSort | 2023 |
publisher | Springer |
record_format | marc |
series2 | Springer texts in statistics |
spellingShingle | James, Gareth Witten, Daniela Hastie, Trevor 1953- Tibshirani, Robert 1956- Taylor, Jonathan E. An introduction to statistical learning with applications in Python Python (DE-588)118793772 gnd Statistical Theory and Methods Statistics and Computing Applied Statistics Statistics Datenanalyse (DE-588)4123037-1 gnd Statistik (DE-588)4056995-0 gnd Maschinelles Lernen (DE-588)4193754-5 gnd |
subject_GND | (DE-588)118793772 (DE-588)4123037-1 (DE-588)4056995-0 (DE-588)4193754-5 |
title | An introduction to statistical learning with applications in Python |
title_auth | An introduction to statistical learning with applications in Python |
title_exact_search | An introduction to statistical learning with applications in Python |
title_exact_search_txtP | An Introduction to Statistical Learning with Applications in Python |
title_full | An introduction to statistical learning with applications in Python Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, Jonathan Taylor |
title_fullStr | An introduction to statistical learning with applications in Python Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, Jonathan Taylor |
title_full_unstemmed | An introduction to statistical learning with applications in Python Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, Jonathan Taylor |
title_short | An introduction to statistical learning |
title_sort | an introduction to statistical learning with applications in python |
title_sub | with applications in Python |
topic | Python (DE-588)118793772 gnd Statistical Theory and Methods Statistics and Computing Applied Statistics Statistics Datenanalyse (DE-588)4123037-1 gnd Statistik (DE-588)4056995-0 gnd Maschinelles Lernen (DE-588)4193754-5 gnd |
topic_facet | Python Statistical Theory and Methods Statistics and Computing Applied Statistics Statistics Datenanalyse Statistik Maschinelles Lernen |
url | https://www.statlearning.com/ http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=034309627&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT jamesgareth anintroductiontostatisticallearningwithapplicationsinpython AT wittendaniela anintroductiontostatisticallearningwithapplicationsinpython AT hastietrevor anintroductiontostatisticallearningwithapplicationsinpython AT tibshiranirobert anintroductiontostatisticallearningwithapplicationsinpython AT taylorjonathane anintroductiontostatisticallearningwithapplicationsinpython |
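The key-value pairs above are fields of a VuFind-style Solr search index. A minimal sketch of retrieving this record by its id field over Solr's standard select API; the endpoint URL and core name ("biblio") are hypothetical assumptions, while the field names match the index fields listed above:

```python
# Sketch: query a VuFind-style Solr index for this record by id.
# The Solr URL and core name are hypothetical placeholders.
import json
import urllib.parse
import urllib.request

SOLR_SELECT = "http://localhost:8983/solr/biblio/select"  # hypothetical endpoint
params = urllib.parse.urlencode({
    "q": 'id:"DE-604.BV049047206"',
    "fl": "title,author,isbn,publishDate,url",
    "wt": "json",
})
with urllib.request.urlopen(f"{SOLR_SELECT}?{params}") as resp:
    for doc in json.load(resp)["response"]["docs"]:
        print(doc)
```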
Open full text
THWS Würzburg Zentralbibliothek Lesesaal
Call number: | 1000 ST 250 J27 |
---|---|
Copy 1 | Loanable – checked out, return due: 23.06.2025 – Place hold |