Introduction to machine learning with Python: a guide for data scientists
Saved in:
Main authors: | Müller, Andreas Christian ; Guido, Sarah |
---|---|
Format: | Book |
Language: | English |
Published: | © 2016 |
Edition: | First edition |
Subjects: | Python (programming language) ; Machine learning |
Online access: | Table of contents |
Note: | Later unchanged reprints (2017) are also recorded here |
Physical description: | xii, 378 pages, illustrations |
ISBN: | 9781449369415 1449369413 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV043292304 | ||
003 | DE-604 | ||
005 | 20181119 | ||
007 | t | ||
008 | 160119s2016 a||| |||| 00||| eng d | ||
020 | |a 9781449369415 |9 978-1-449-36941-5 | ||
020 | |a 1449369413 |9 1-449-36941-3 | ||
035 | |a (OCoLC)964453057 | ||
035 | |a (DE-599)BVBBV043292304 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-739 |a DE-862 |a DE-1051 |a DE-19 |a DE-573 |a DE-Aug4 |a DE-11 |a DE-523 |a DE-91G |a DE-91 |a DE-83 |a DE-188 |a DE-521 |a DE-861 |a DE-M382 |a DE-29T |a DE-355 | ||
084 | |a ST 250 |0 (DE-625)143626: |2 rvk | ||
084 | |a ST 300 |0 (DE-625)143650: |2 rvk | ||
084 | |a DAT 708f |2 stub | ||
084 | |a 68P01 |2 msc | ||
084 | |a DAT 366f |2 stub | ||
100 | 1 | |a Müller, Andreas Christian |0 (DE-588)1060129469 |4 aut | |
245 | 1 | 0 | |a Introduction to machine learning with Python |b a guide for data scientists |c Andreas C. Müller and Sarah Guido |
250 | |a First edition | ||
264 | 0 | |a Beijing ; Boston ; Farnham ; Sebastopol ; Tokyo |b O'Reilly |c [October 2016] | |
264 | 4 | |c © 2016 | |
300 | |a xii, 378 Seiten |b Illustrationen | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
500 | |a Hier auch später erschienene, unveränderte Nachdrucke (2017) | ||
650 | 0 | 7 | |a Python |g Programmiersprache |0 (DE-588)4434275-5 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Python |g Programmiersprache |0 (DE-588)4434275-5 |D s |
689 | 0 | 1 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Guido, Sarah |0 (DE-588)1117052265 |4 aut | |
776 | 0 | 8 | |i Erscheint auch als |n Online-Ausgabe |z 978-1-449-36990-3 |
856 | 4 | 2 | |m Digitalisierung UB Passau - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028713402&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
943 | 1 | |a oai:aleph.bib-bvb.de:BVB01-028713402 |
Record in the search index
DE-BY-862_location | 2000 |
---|---|
DE-BY-FWS_call_number | 2000/ST 250 P99 M946 I6 |
DE-BY-FWS_katkey | 627906 |
DE-BY-FWS_media_number | 083000515978 |
_version_ | 1808867036160327680 |
adam_text |
Table of Contents
Preface vii
1. Introduction 1
Why Machine Learning? 1
Problems Machine Learning Can Solve 2
Knowing Your Task and Knowing Your Data 4
Why Python? 5
scikit-learn 5
Installing scikit-learn 6
Essential Libraries and Tools 7
Jupyter Notebook 7
NumPy 7
SciPy 8
matplotlib 9
pandas 10
mglearn 11
Python 2 Versus Python 3 12
Versions Used in this Book 12
A First Application: Classifying Iris Species 13
Meet the Data 14
Measuring Success: Training and Testing Data 17
First Things First: Look at Your Data 19
Building Your First Model: k-Nearest Neighbors 20
Making Predictions 22
Evaluating the Model 22
Summary and Outlook 23
2. Supervised Learning 25
Classification and Regression 25
Generalization, Overfitting, and Underfitting 26
Relation of Model Complexity to Dataset Size 29
Supervised Machine Learning Algorithms 29
Some Sample Datasets 30
k-Nearest Neighbors 35
Linear Models 45
Naive Bayes Classifiers 68
Decision Trees 70
Ensembles of Decision Trees 83
Kernelized Support Vector Machines 92
Neural Networks (Deep Learning) 104
Uncertainty Estimates from Classifiers 119
The Decision Function 120
Predicting Probabilities 122
Uncertainty in Multiclass Classification 124
Summary and Outlook 127
3. Unsupervised Learning and Preprocessing 131
Types of Unsupervised Learning 131
Challenges in Unsupervised Learning 132
Preprocessing and Scaling 132
Different Kinds of Preprocessing 133
Applying Data Transformations 134
Scaling Training and Test Data the Same Way 136
The Effect of Preprocessing on Supervised Learning 138
Dimensionality Reduction, Feature Extraction, and Manifold Learning 140
Principal Component Analysis (PCA) 140
Non-Negative Matrix Factorization (NMF) 156
Manifold Learning with t-SNE 163
Clustering 168
k-Means Clustering 168
Agglomerative Clustering 182
DBSCAN 187
Comparing and Evaluating Clustering Algorithms 191
Summary of Clustering Methods 207
Summary and Outlook 208
4. Representing Data and Engineering Features 211
Categorical Variables 212
One-Hot-Encoding (Dummy Variables) 213
Numbers Can Encode Categoricals 218
Binning, Discretization, Linear Models, and Trees 220
Interactions and Polynomials 224
Univariate Nonlinear Transformations 232
Automatic Feature Selection 236
Univariate Statistics 236
Model-Based Feature Selection 238
Iterative Feature Selection 240
Utilizing Expert Knowledge 242
Summary and Outlook 250
5. Model Evaluation and Improvement 251
Cross-Validation 252
Cross-Validation in scikit-learn 253
Benefits of Cross-Validation 254
Stratified k-Fold Cross-Validation and Other Strategies 254
Grid Search 260
Simple Grid Search 261
The Danger of Overfitting the Parameters and the Validation Set 261
Grid Search with Cross-Validation 263
Evaluation Metrics and Scoring 275
Keep the End Goal in Mind 275
Metrics for Binary Classification 276
Metrics for Multiclass Classification 296
Regression Metrics 299
Using Evaluation Metrics in Model Selection 300
Summary and Outlook 302
6. Algorithm Chains and Pipelines 305
Parameter Selection with Preprocessing 306
Building Pipelines 308
Using Pipelines in Grid Searches 309
The General Pipeline Interface 312
Convenient Pipeline Creation with make_pipeline 313
Accessing Step Attributes 314
Accessing Attributes in a Grid-Searched Pipeline 315
Grid-Searching Preprocessing Steps and Model Parameters 317
Grid-Searching Which Model To Use 319
Summary and Outlook 320
7. Working with Text Data
Types of Data Represented as Strings 323
Example Application: Sentiment Analysis of Movie Reviews 325
Representing Text Data as a Bag of Words 327
Applying Bag-of-Words to a Toy Dataset 329
Bag-of-Words for Movie Reviews 330
Stopwords 334
Rescaling the Data with tf-idf 336
Investigating Model Coefficients 338
Bag-of-Words with More Than One Word (n-Grams) 339
Advanced Tokenization, Stemming, and Lemmatization 344
Topic Modeling and Document Clustering 347
Latent Dirichlet Allocation 348
Summary and Outlook 355
8. Wrapping Up 357
Approaching a Machine Learning Problem 357
Humans in the Loop 358
From Prototype to Production 359
Testing Production Systems 359
Building Your Own Estimator 360
Where to Go from Here 361
Theory 361
Other Machine Learning Frameworks and Packages 362
Ranking, Recommender Systems, and Other Kinds of Learning 363
Probabilistic Modeling, Inference, and Probabilistic Programming 363
Neural Networks 364
Scaling to Larger Datasets 364
Honing Your Skills 365
Conclusion 366
Index 367 |
any_adam_object | 1 |
author | Müller, Andreas Christian Guido, Sarah |
author_GND | (DE-588)1060129469 (DE-588)1117052265 |
author_facet | Müller, Andreas Christian Guido, Sarah |
author_role | aut aut |
author_sort | Müller, Andreas Christian |
author_variant | a c m ac acm s g sg |
building | Verbundindex |
bvnumber | BV043292304 |
classification_rvk | ST 250 ST 300 |
classification_tum | DAT 708f DAT 366f |
ctrlnum | (OCoLC)964453057 (DE-599)BVBBV043292304 |
discipline | Informatik |
edition | First edition |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>00000nam a2200000 c 4500</leader><controlfield tag="001">BV043292304</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20181119</controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">160119s2016 a||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781449369415</subfield><subfield code="9">978-1-449-36941-5</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">1449369413</subfield><subfield code="9">1-449-36941-3</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)964453057</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV043292304</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-739</subfield><subfield code="a">DE-862</subfield><subfield code="a">DE-1051</subfield><subfield code="a">DE-19</subfield><subfield code="a">DE-573</subfield><subfield code="a">DE-Aug4</subfield><subfield code="a">DE-11</subfield><subfield code="a">DE-523</subfield><subfield code="a">DE-91G</subfield><subfield code="a">DE-91</subfield><subfield code="a">DE-83</subfield><subfield code="a">DE-188</subfield><subfield code="a">DE-521</subfield><subfield code="a">DE-861</subfield><subfield code="a">DE-M382</subfield><subfield code="a">DE-29T</subfield><subfield code="a">DE-355</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 250</subfield><subfield code="0">(DE-625)143626:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield 
code="a">ST 300</subfield><subfield code="0">(DE-625)143650:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">DAT 708f</subfield><subfield code="2">stub</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">68P01</subfield><subfield code="2">msc</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">DAT 366f</subfield><subfield code="2">stub</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Müller, Andreas Christian</subfield><subfield code="0">(DE-588)1060129469</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Introduction to machine learning with Python</subfield><subfield code="b">a guide for data scientists</subfield><subfield code="c">Andreas C. Müller and Sarah Guido</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">First edition</subfield></datafield><datafield tag="264" ind1=" " ind2="0"><subfield code="a">Beijing ; Boston ; Farnham ; Sebastopol ; Tokyo</subfield><subfield code="b">O'Reilly</subfield><subfield code="c">[October 2016]</subfield></datafield><datafield tag="264" ind1=" " ind2="4"><subfield code="c">© 2016</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">xii, 378 Seiten</subfield><subfield code="b">Illustrationen</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Hier auch später erschienene, unveränderte Nachdrucke (2017)</subfield></datafield><datafield tag="650" ind1="0" 
ind2="7"><subfield code="a">Python</subfield><subfield code="g">Programmiersprache</subfield><subfield code="0">(DE-588)4434275-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Maschinelles Lernen</subfield><subfield code="0">(DE-588)4193754-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Python</subfield><subfield code="g">Programmiersprache</subfield><subfield code="0">(DE-588)4434275-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Maschinelles Lernen</subfield><subfield code="0">(DE-588)4193754-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Guido, Sarah</subfield><subfield code="0">(DE-588)1117052265</subfield><subfield code="4">aut</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Online-Ausgabe</subfield><subfield code="z">978-1-449-36990-3</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Passau - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028713402&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="943" ind1="1" ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-028713402</subfield></datafield></record></collection> |
id | DE-604.BV043292304 |
illustrated | Illustrated |
indexdate | 2024-08-31T04:07:13Z |
institution | BVB |
isbn | 9781449369415 1449369413 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-028713402 |
oclc_num | 964453057 |
open_access_boolean | |
owner | DE-739 DE-862 DE-BY-FWS DE-1051 DE-19 DE-BY-UBM DE-573 DE-Aug4 DE-11 DE-523 DE-91G DE-BY-TUM DE-91 DE-BY-TUM DE-83 DE-188 DE-521 DE-861 DE-M382 DE-29T DE-355 DE-BY-UBR |
owner_facet | DE-739 DE-862 DE-BY-FWS DE-1051 DE-19 DE-BY-UBM DE-573 DE-Aug4 DE-11 DE-523 DE-91G DE-BY-TUM DE-91 DE-BY-TUM DE-83 DE-188 DE-521 DE-861 DE-M382 DE-29T DE-355 DE-BY-UBR |
physical | xii, 378 Seiten Illustrationen |
publishDate | 2016 |
publishDateSearch | 2016 |
publishDateSort | 2016 |
record_format | marc |
spellingShingle | Müller, Andreas Christian Guido, Sarah Introduction to machine learning with Python a guide for data scientists Python Programmiersprache (DE-588)4434275-5 gnd Maschinelles Lernen (DE-588)4193754-5 gnd |
subject_GND | (DE-588)4434275-5 (DE-588)4193754-5 |
title | Introduction to machine learning with Python a guide for data scientists |
title_auth | Introduction to machine learning with Python a guide for data scientists |
title_exact_search | Introduction to machine learning with Python a guide for data scientists |
title_full | Introduction to machine learning with Python a guide for data scientists Andreas C. Müller and Sarah Guido |
title_fullStr | Introduction to machine learning with Python a guide for data scientists Andreas C. Müller and Sarah Guido |
title_full_unstemmed | Introduction to machine learning with Python a guide for data scientists Andreas C. Müller and Sarah Guido |
title_short | Introduction to machine learning with Python |
title_sort | introduction to machine learning with python a guide for data scientists |
title_sub | a guide for data scientists |
topic | Python Programmiersprache (DE-588)4434275-5 gnd Maschinelles Lernen (DE-588)4193754-5 gnd |
topic_facet | Python Programmiersprache Maschinelles Lernen |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028713402&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT mullerandreaschristian introductiontomachinelearningwithpythonaguidefordatascientists AT guidosarah introductiontomachinelearningwithpythonaguidefordatascientists |
Table of contents
THWS Schweinfurt Central Library, Reading Room
Call number: |
2000 ST 250 P99 M946 I6 |
---|---|
Copy 1 | Available for loan; checked out, due back: 10.02.2025 |