Scikit-learn Cookbook: over 50 recipes to incorporate scikit-learn into every step of the data science pipeline, from feature extraction to model building and model evaluation
If you're a data scientist already familiar with Python but not Scikit-Learn, or are familiar with other programming languages like R and want to take the plunge with the gold standard of Python machine learning libraries, then this is the book for you.
Main author: | Hauck, Trent |
---|---|
Format: | Book |
Language: | English |
Published: |
Olton
Packt Publishing
2014
|
Subjects: | |
Online access: | Table of contents |
Summary: | If you're a data scientist already familiar with Python but not Scikit-Learn, or are familiar with other programming languages like R and want to take the plunge with the gold standard of Python machine learning libraries, then this is the book for you. |
Description: | III, 199 pages, illustrations |
ISBN: | 9781783989492 9781783989485 |
Internal format
MARC
LEADER | 00000nam a2200000zc 4500 | ||
---|---|---|---|
001 | BV043634464 | ||
003 | DE-604 | ||
005 | 20160707 | ||
007 | t | ||
008 | 160621s2014 a||| |||| 00||| eng d | ||
020 | |a 9781783989492 |9 978-1-78398-949-2 | ||
020 | |a 9781783989485 |c Print |9 978-1-78398-948-5 | ||
035 | |a (OCoLC)953500395 | ||
035 | |a (DE-599)BVBBV043634464 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-739 | ||
082 | 0 | |a 641.5 | |
084 | |a ST 250 |0 (DE-625)143626: |2 rvk | ||
100 | 1 | |a Hauck, Trent |e Verfasser |4 aut | |
245 | 1 | 0 | |a Scikit-learn Cookbook |b over 50 recipes to incorporate scikit-learn into every step of the data science pipeline, from feature extraction to model building and model evaluation |
264 | 1 | |a Olton |b Packt Publishing |c 2014 | |
300 | |a III, 199 Seiten |b Illustrationen | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
520 | |a If you're a data scientist already familiar with Python but not Scikit-Learn, or are familiar with other programming languages like R and want to take the plunge with the gold standard of Python machine learning libraries, then this is the book for you | ||
650 | 4 | |a Cookbooks | |
650 | 0 | 7 | |a Python |g Programmiersprache |0 (DE-588)4434275-5 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Python |g Programmiersprache |0 (DE-588)4434275-5 |D s |
689 | 0 | |5 DE-604 | |
856 | 4 | 2 | |m Digitalisierung UB Passau - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029048424&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-029048424 |
Record in the search index
_version_ | 1804176370489622528 |
---|---|
adam_text | Table of Contents
Preface 1
Chapter 1: Premodel Workflow 7
Introduction 8
Getting sample data from external sources 8
Creating sample data for toy analysis 10
Scaling data to the standard normal 13
Creating binary features through thresholding 16
Working with categorical variables 17
Binarizing label features 20
Imputing missing values through various strategies 22
Using Pipelines for multiple preprocessing steps 25
Reducing dimensionality with PCA 28
Using factor analysis for decomposition 31
Kernel PCA for nonlinear dimensionality reduction 33
Using truncated SVD to reduce dimensionality 36
Decomposition to classify with DictionaryLearning 39
Putting it all together with Pipelines 41
Using Gaussian processes for regression 44
Defining the Gaussian process object directly 50
Using stochastic gradient descent for regression 51
Chapter 2: Working with Linear Models 55
Introduction 55
Fitting a line through data 56
Evaluating the linear regression model 58
Using ridge regression to overcome linear regression's shortfalls 63
Optimizing the ridge regression parameter 66
Using sparsity to regularize models 70
Taking a more fundamental approach to regularization with LARS 72
Using linear methods for classification - logistic regression 75
Directly applying Bayesian ridge regression 79
Using boosting to learn from errors 81
Chapter 3: Building Models with Distance Metrics 85
Introduction 85
Using KMeans to cluster data 86
Optimizing the number of centroids 90
Assessing cluster correctness 93
Using MiniBatch KMeans to handle more data 97
Quantizing an image with KMeans clustering 99
Finding the closest objects in the feature space 102
Probabilistic clustering with Gaussian Mixture Models 105
Using KMeans for outlier detection 111
Using k-NN for regression 115
Chapter 4: Classifying Data with scikit-learn 119
Introduction 119
Doing basic classifications with Decision Trees 120
Tuning a Decision Tree model 125
Using many Decision Trees - random forests 130
Tuning a random forest model 134
Classifying data with support vector machines 140
Generalizing with multiclass classification 145
Using LDA for classification 147
Working with QDA - a nonlinear LDA 151
Using Stochastic Gradient Descent for classification 153
Classifying documents with Naïve Bayes 154
Label propagation with semi-supervised learning 157
Chapter 5: Postmodel Workflow 161
Introduction 161
K-fold cross validation 162
Automatic cross validation 164
Cross validation with ShuffleSplit 165
Stratified k-fold 169
Poor man's grid search 172
Brute force grid search 175
Using dummy estimators to compare results 177
Regression model evaluation 180
Feature selection 184
Feature selection on L1 norms 187
Persisting models with joblib 191
Index 195
|
any_adam_object | 1 |
author | Hauck, Trent |
author_facet | Hauck, Trent |
author_role | aut |
author_sort | Hauck, Trent |
author_variant | t h th |
building | Verbundindex |
bvnumber | BV043634464 |
classification_rvk | ST 250 |
ctrlnum | (OCoLC)953500395 (DE-599)BVBBV043634464 |
dewey-full | 641.5 |
dewey-hundreds | 600 - Technology (Applied sciences) |
dewey-ones | 641 - Food and drink |
dewey-raw | 641.5 |
dewey-search | 641.5 |
dewey-sort | 3641.5 |
dewey-tens | 640 - Home and family management |
discipline | Informatik Agrar-/Forst-/Ernährungs-/Haushaltswissenschaft / Gartenbau |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01698nam a2200361zc 4500</leader><controlfield tag="001">BV043634464</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20160707 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">160621s2014 a||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781783989492</subfield><subfield code="9">978-1-78398-949-2</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781783989485</subfield><subfield code="c">Print</subfield><subfield code="9">978-1-78398-948-5</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)953500395</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV043634464</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-739</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">641.5</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 250</subfield><subfield code="0">(DE-625)143626:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Hauck, Trent</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Scikit-learn Cookbook</subfield><subfield code="b">over 50 recipes in corporate scikit-learn into every step of the data science pipeline, from feature extraction to model building and model evaluation</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield 
code="a">Olton</subfield><subfield code="b">Packt Publishing</subfield><subfield code="c">2014</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">III, 199 Seiten</subfield><subfield code="b">Illustrationen</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">If you're a data scientist already familiar with Python but not Scikit-Learn, or are familiar with other programming languages like R and want to take the plunge with the gold standard of Python machine learning libraries, then this is the book for you</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Cookbooks</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Python</subfield><subfield code="g">Programmiersprache</subfield><subfield code="0">(DE-588)4434275-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Python</subfield><subfield code="g">Programmiersprache</subfield><subfield code="0">(DE-588)4434275-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Passau - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029048424&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield 
code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-029048424</subfield></datafield></record></collection> |
id | DE-604.BV043634464 |
illustrated | Illustrated |
indexdate | 2024-07-10T07:31:06Z |
institution | BVB |
isbn | 9781783989492 9781783989485 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-029048424 |
oclc_num | 953500395 |
open_access_boolean | |
owner | DE-739 |
owner_facet | DE-739 |
physical | III, 199 Seiten Illustrationen |
publishDate | 2014 |
publishDateSearch | 2014 |
publishDateSort | 2014 |
publisher | Packt Publishing |
record_format | marc |
spelling | Hauck, Trent Verfasser aut Scikit-learn Cookbook over 50 recipes to incorporate scikit-learn into every step of the data science pipeline, from feature extraction to model building and model evaluation Olton Packt Publishing 2014 III, 199 Seiten Illustrationen txt rdacontent n rdamedia nc rdacarrier If you're a data scientist already familiar with Python but not Scikit-Learn, or are familiar with other programming languages like R and want to take the plunge with the gold standard of Python machine learning libraries, then this is the book for you Cookbooks Python Programmiersprache (DE-588)4434275-5 gnd rswk-swf Python Programmiersprache (DE-588)4434275-5 s DE-604 Digitalisierung UB Passau - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029048424&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Hauck, Trent Scikit-learn Cookbook over 50 recipes to incorporate scikit-learn into every step of the data science pipeline, from feature extraction to model building and model evaluation Cookbooks Python Programmiersprache (DE-588)4434275-5 gnd |
subject_GND | (DE-588)4434275-5 |
title | Scikit-learn Cookbook over 50 recipes to incorporate scikit-learn into every step of the data science pipeline, from feature extraction to model building and model evaluation |
title_auth | Scikit-learn Cookbook over 50 recipes to incorporate scikit-learn into every step of the data science pipeline, from feature extraction to model building and model evaluation |
title_exact_search | Scikit-learn Cookbook over 50 recipes to incorporate scikit-learn into every step of the data science pipeline, from feature extraction to model building and model evaluation |
title_full | Scikit-learn Cookbook over 50 recipes to incorporate scikit-learn into every step of the data science pipeline, from feature extraction to model building and model evaluation |
title_fullStr | Scikit-learn Cookbook over 50 recipes to incorporate scikit-learn into every step of the data science pipeline, from feature extraction to model building and model evaluation |
title_full_unstemmed | Scikit-learn Cookbook over 50 recipes to incorporate scikit-learn into every step of the data science pipeline, from feature extraction to model building and model evaluation |
title_short | Scikit-learn Cookbook |
title_sort | scikit learn cookbook over 50 recipes to incorporate scikit learn into every step of the data science pipeline from feature extraction to model building and model evaluation |
title_sub | over 50 recipes to incorporate scikit-learn into every step of the data science pipeline, from feature extraction to model building and model evaluation |
topic | Cookbooks Python Programmiersprache (DE-588)4434275-5 gnd |
topic_facet | Cookbooks Python Programmiersprache |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029048424&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT haucktrent scikitlearncookbookover50recipesincorporatescikitlearnintoeverystepofthedatasciencepipelinefromfeatureextractiontomodelbuildingandmodelevaluation |