Python data science essentials: a practitioner's guide covering essential data science principles, tools, and techniques
Main Authors: | Boschetti, Alberto ; Massaron, Luca |
---|---|
Format: | Electronic eBook |
Language: | English |
Published: | Birmingham ; Mumbai : Packt, September 2018 |
Edition: | Third edition |
Subjects: | Python (programming language) |
Online Access: | UBY01 UER01 Table of contents |
Summary: | Cover -- Title Page -- Copyright and Credits -- Packt Upsell -- Contributors -- Table of Contents -- Preface -- Chapter 1: First Steps -- Introducing data science and Python -- Installing Python -- Python 2 or Python 3? -- Step-by-step installation -- Installing the necessary packages -- Package upgrades -- Scientific distributions -- Anaconda -- Leveraging conda to install packages -- Enthought Canopy -- WinPython -- Explaining virtual environments -- Conda for managing environments -- A glance at the essential packages -- NumPy -- SciPy -- pandas -- pandas-profiling -- Scikit-learn -- Jupyter -- JupyterLab -- Matplotlib -- Seaborn -- Statsmodels -- Beautiful Soup -- NetworkX -- NLTK -- Gensim -- PyPy -- XGBoost -- LightGBM -- CatBoost -- TensorFlow -- Keras -- Introducing Jupyter -- Fast installation and first test usage -- Jupyter magic commands -- Installing packages directly from Jupyter Notebooks -- Checking the new JupyterLab environment -- How Jupyter Notebooks can help data scientists -- Alternatives to Jupyter -- Datasets and code used in this book -- Scikit-learn toy datasets -- The MLdata.org and other public repositories for open source data -- LIBSVM data examples -- Loading data directly from CSV or text files -- Scikit-learn sample generators -- Summary -- Chapter 2: Data Munging -- The data science process -- Data loading and preprocessing with pandas -- Fast and easy data loading -- Dealing with problematic data -- Dealing with big datasets -- Accessing other data formats -- Putting data together -- Data preprocessing -- Data selection -- Working with categorical and textual data -- A special type of data - text -- Scraping the web with Beautiful Soup -- Data processing with NumPy -- NumPy's n-dimensional array -- The basics of NumPy ndarray objects -- Creating NumPy arrays -- From lists to unidimensional arrays -- Controlling memory size -- Heterogeneous lists -- From lists to multidimensional arrays -- Resizing arrays -- Arrays derived from NumPy functions -- Getting an array directly from a file -- Extracting data from pandas -- NumPy fast operation and computations -- Matrix operations -- Slicing and indexing with NumPy arrays -- Stacking NumPy arrays -- Working with sparse arrays -- Summary -- Chapter 3: The Data Pipeline -- Introducing EDA -- Building new features -- Dimensionality reduction -- The covariance matrix -- Principal component analysis -- PCA for big data - RandomizedPCA -- Latent factor analysis -- Linear discriminant analysis -- Latent semantical analysis -- Independent component analysis -- Kernel PCA -- T-SNE -- Restricted Boltzmann Machine -- The detection and treatment of outliers -- Univariate outlier detection -- EllipticEnvelope -- OneClassSVM -- Validation metrics -- Multilabel classification -- Binary classification -- Regression -- Testing and validating -- Cross-validation -- Using cross-validation iterators -- Sampling and bootstrapping -- Hyperparameter optimization -- Building custom scoring functions -- Reducing the grid search runtime -- Feature selection -- Selection based on feature variance -- Univariate selection -- Recursive elimination -- Stability and L1-based selection -- Wrapping everything in a pipeline -- Combining features together and chaining transformations -- Building custom transformation functions -- Summary -- Chapter 4: Machine Learning -- Preparing tools and datasets -- Linear and logistic regression -- Naive Bayes -- K-Nearest Neighbors -- Nonlinear algorithms -- SVM for classification -- SVM for regression -- Tuning SVM -- Ensemble strategies -- Pasting by random samples -- Bagging with weak classifiers -- Random Subspaces and Random Patches -- Random Forests and Extra-Trees -- Estimating probabilities from an ensemble -- Sequences of models - AdaBoost -- Gradient tree boosting (GTB) -- XGBoost -- LightGBM -- CatBoost -- Dealing with big data -- Creating some big datasets as examples -- Scalability with volume -- Keeping up with velocity -- Dealing with variety -- An overview of Stochastic Gradient Descent (SGD) -- A peek into natural language processing (NLP) -- Word tokenization -- Stemming -- Word tagging -- Named entity recognition (NER) -- Stopwords -- A complete data science example - text classification -- An overview of unsupervised learning -- K-means -- DBSCAN - a density-based clustering technique -- Latent Dirichlet Allocation (LDA) -- Summary -- Chapter 5: Visualization, Insights, and Results -- Introducing the basics of matplotlib -- Trying curve plotting -- Using panels for clearer representations -- Plotting scatterplots for relationships in data -- Histograms -- Bar graphs -- Image visualization -- Selected graphical examples with pandas -- Working with boxplots and histograms -- Plotting scatterplots -- Discovering patterns by parallel coordinates -- Wrapping up matplotlib's commands -- Introducing Seaborn -- Enhancing your EDA capabilities -- Advanced data learning representation -- Learning curves -- Validation curves -- Feature importance for RandomForests -- Gradient Boosting Trees partial dependence plotting -- Creating a prediction server with machine-learning-as-a-service -- Summary -- Chapter 6: Social Network Analysis -- Introduction to graph theory -- Graph algorithms -- Types of node centrality -- Partitioning a network -- Graph loading, dumping, and sampling -- Summary -- Chapter 7: Deep Learning Beyond the Basics -- Approaching deep learning -- Classifying images with CNN -- Using pre-trained models -- Working with temporal sequences -- Summary -- Chapter 8: Spark for Big Data |
Physical Description: | 1 online resource (vi, 453 pages) illustrations, diagrams |
ISBN: | 9781789531893 |
Internal format
MARC
LEADER | 00000nmm a2200000 c 4500 | ||
---|---|---|---|
001 | BV045372897 | ||
003 | DE-604 | ||
005 | 20230815 | ||
007 | cr|uuu---uuuuu | ||
008 | 181217s2018 |||| o||u| ||||||eng d | ||
020 | |a 9781789531893 |c Online |9 978-1-78953-189-3 | ||
035 | |a (ZDB-30-PQE)EBC5532279 | ||
035 | |a (OCoLC)1079414711 | ||
035 | |a (DE-599)GBV1032472871 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-706 |a DE-83 |a DE-29 | ||
100 | 1 | |a Boschetti, Alberto |e Verfasser |0 (DE-588)1182494579 |4 aut | |
245 | 1 | 0 | |a Python data science essentials |b a practitioner's guide covering essential data science principles, tools, and techniques |
250 | |a Third edition | ||
264 | 1 | |a Birmingham ; Mumbai |b Packt |c September 2018 | |
300 | |a 1 Online-Ressource (vi, 453 Seiten) |b Illustrationen, Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b c |2 rdamedia | ||
338 | |b cr |2 rdacarrier | ||
520 | 3 | |a Cover -- Title Page -- Copyright and Credits -- Packt Upsell -- Contributors -- Table of Contents -- Preface -- Chapter 1: First Steps -- Introducing data science and Python -- Installing Python -- Python 2 or Python 3? -- Step-by-step installation -- Installing the necessary packages -- Package upgrades -- Scientific distributions -- Anaconda -- Leveraging conda to install packages -- Enthought Canopy -- WinPython -- Explaining virtual environments -- Conda for managing environments -- A glance at the essential packages -- NumPy -- SciPy -- pandas -- pandas-profiling -- Scikit-learn -- Jupyter -- JupyterLab -- Matplotlib -- Seaborn -- Statsmodels -- Beautiful Soup -- NetworkX -- NLTK -- Gensim -- PyPy -- XGBoost -- LightGBM -- CatBoost -- TensorFlow -- Keras -- Introducing Jupyter -- Fast installation and first test usage -- Jupyter magic commands -- Installing packages directly from Jupyter Notebooks -- Checking the new JupyterLab environment -- How Jupyter Notebooks can help data scientists -- Alternatives to Jupyter -- Datasets and code used in this book -- Scikit-learn toy datasets -- The MLdata.org and other public repositories for open source data -- LIBSVM data examples -- Loading data directly from CSV or text files -- Scikit-learn sample generators -- Summary -- Chapter 2: Data Munging -- The data science process -- Data loading and preprocessing with pandas -- Fast and easy data loading -- Dealing with problematic data -- Dealing with big datasets -- Accessing other data formats -- Putting data together -- Data preprocessing -- Data selection -- Working with categorical and textual data -- A special type of data - text -- Scraping the web with Beautiful Soup -- Data processing with NumPy -- NumPy's n-dimensional array -- The basics of NumPy ndarray objects -- Creating NumPy arrays -- From lists to unidimensional arrays | |
520 | 3 | |a Controlling memory size -- Heterogeneous lists -- From lists to multidimensional arrays -- Resizing arrays -- Arrays derived from NumPy functions -- Getting an array directly from a file -- Extracting data from pandas -- NumPy fast operation and computations -- Matrix operations -- Slicing and indexing with NumPy arrays -- Stacking NumPy arrays -- Working with sparse arrays -- Summary -- Chapter 3: The Data Pipeline -- Introducing EDA -- Building new features -- Dimensionality reduction -- The covariance matrix -- Principal component analysis -- PCA for big data - RandomizedPCA -- Latent factor analysis -- Linear discriminant analysis -- Latent semantical analysis -- Independent component analysis -- Kernel PCA -- T-SNE -- Restricted Boltzmann Machine -- The detection and treatment of outliers -- Univariate outlier detection -- EllipticEnvelope -- OneClassSVM -- Validation metrics -- Multilabel classification -- Binary classification -- Regression -- Testing and validating -- Cross-validation -- Using cross-validation iterators -- Sampling and bootstrapping -- Hyperparameter optimization -- Building custom scoring functions -- Reducing the grid search runtime -- Feature selection -- Selection based on feature variance -- Univariate selection -- Recursive elimination -- Stability and L1-based selection -- Wrapping everything in a pipeline -- Combining features together and chaining transformations -- Building custom transformation functions -- Summary -- Chapter 4: Machine Learning -- Preparing tools and datasets -- Linear and logistic regression -- Naive Bayes -- K-Nearest Neighbors -- Nonlinear algorithms -- SVM for classification -- SVM for regression -- Tuning SVM -- Ensemble strategies -- Pasting by random samples -- Bagging with weak classifiers -- Random Subspaces and Random Patches -- Random Forests and Extra-Trees | |
520 | 3 | |a Estimating probabilities from an ensemble -- Sequences of models - AdaBoost -- Gradient tree boosting (GTB) -- XGBoost -- LightGBM -- CatBoost -- Dealing with big data -- Creating some big datasets as examples -- Scalability with volume -- Keeping up with velocity -- Dealing with variety -- An overview of Stochastic Gradient Descent (SGD) -- A peek into natural language processing (NLP) -- Word tokenization -- Stemming -- Word tagging -- Named entity recognition (NER) -- Stopwords -- A complete data science example - text classification -- An overview of unsupervised learning -- K-means -- DBSCAN - a density-based clustering technique -- Latent Dirichlet Allocation (LDA) -- Summary -- Chapter 5: Visualization, Insights, and Results -- Introducing the basics of matplotlib -- Trying curve plotting -- Using panels for clearer representations -- Plotting scatterplots for relationships in data -- Histograms -- Bar graphs -- Image visualization -- Selected graphical examples with pandas -- Working with boxplots and histograms -- Plotting scatterplots -- Discovering patterns by parallel coordinates -- Wrapping up matplotlib's commands -- Introducing Seaborn -- Enhancing your EDA capabilities -- Advanced data learning representation -- Learning curves -- Validation curves -- Feature importance for RandomForests -- Gradient Boosting Trees partial dependence plotting -- Creating a prediction server with machine-learning-as-a-service -- Summary -- Chapter 6: Social Network Analysis -- Introduction to graph theory -- Graph algorithms -- Types of node centrality -- Partitioning a network -- Graph loading, dumping, and sampling -- Summary -- Chapter 7: Deep Learning Beyond the Basics -- Approaching deep learning -- Classifying images with CNN -- Using pre-trained models -- Working with temporal sequences -- Summary -- Chapter 8: Spark for Big Data | |
650 | 0 | 7 | |a Python |g Programmiersprache |0 (DE-588)4434275-5 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Python |g Programmiersprache |0 (DE-588)4434275-5 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Massaron, Luca |e Verfasser |0 (DE-588)1104968622 |4 aut | |
776 | 0 | 8 | |i Erscheint auch als |n Druck-Ausgabe |z 9781789537864 |
856 | 4 | 2 | |m DNB Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=030759353&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
912 | |a ZDB-4-NLEBK |a ZDB-30-PQE |a ZDB-5-WPSE | ||
999 | |a oai:aleph.bib-bvb.de:BVB01-030759353 | ||
966 | e | |u https://ebookcentral.proquest.com/lib/unibwm/detail.action?docID=5532279 |l UBY01 |p ZDB-30-PQE |q UBY01_Einzelkauf18 |x Aggregator |3 Volltext | |
966 | e | |u https://ebookcentral.proquest.com/lib/erlangen/detail.action?docID=5532279 |l UER01 |p ZDB-30-PQE |q UER_PDA_PQE_Kauf_2023 |x Aggregator |3 Volltext |
Record in the search index
_version_ | 1804179217609392128 |
---|---|
adam_text | INHALT
YY DER WURZELWERKLER - UEBER DEN AUTOR 4
YY FRIEDEN IST EIN LEBENSWEG 7
YY STILLE IST EIN GEMEINWOHL 25
COMPUTER MACHEN MIT DER KOMMUNIKATION
WAS ZAEUNE MIT WEIDELAND MACHTEN
UND AUTOS MIT STRASSEN
YY DWELLING - WIE SICH ANWOHNER UND
EINWOHNER UNTERSCHEIDEN 37
YY SPEED? 49
WELCHE GESCHWINDIGKEIT?
GEFANGENE DER BESCHLEUNIGUNG & EILE ...
YY SELBER LEBEN STATT GELEBT WERDEN 59
YY SCHOENHEIT & DIE MUELLHALDE 67
YY DAS LAND DER GEFUNDENEN FREUNDSCHAFTEN 79
YY EIN PERSOENLICHES NACHWORT DES HERAUSGEBERS 92
HTTP://D-NB.INFO/1079414711
|
any_adam_object | 1 |
author | Boschetti, Alberto Massaron, Luca |
author_GND | (DE-588)1182494579 (DE-588)1104968622 |
author_facet | Boschetti, Alberto Massaron, Luca |
author_role | aut aut |
author_sort | Boschetti, Alberto |
author_variant | a b ab l m lm |
building | Verbundindex |
bvnumber | BV045372897 |
collection | ZDB-4-NLEBK ZDB-30-PQE ZDB-5-WPSE |
ctrlnum | (ZDB-30-PQE)EBC5532279 (OCoLC)1079414711 (DE-599)GBV1032472871 |
edition | Third edition |
format | Electronic eBook |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>07491nmm a2200421 c 4500</leader><controlfield tag="001">BV045372897</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20230815 </controlfield><controlfield tag="007">cr|uuu---uuuuu</controlfield><controlfield tag="008">181217s2018 |||| o||u| ||||||eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781789531893</subfield><subfield code="c">Online</subfield><subfield code="9">978-1-78953-189-3</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(ZDB-30-PQE)EBC5532279</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1079414711</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)GBV1032472871</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-706</subfield><subfield code="a">DE-83</subfield><subfield code="a">DE-29</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Boschetti, Alberto</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1182494579</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Python data science essentials</subfield><subfield code="b">a practitioner's guide covering essential data science principles, tools, and techniques</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">Third edition</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Birmingham ; Mumbai</subfield><subfield code="b">Packt</subfield><subfield code="c">September 
2018</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">1 Online-Ressource (vi, 453 Seiten)</subfield><subfield code="b">Illustrationen, Diagramme</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">c</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">cr</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="520" ind1="3" ind2=" "><subfield code="a">Cover -- Title Page -- Copyright and Credits -- Packt Upsell -- Contributors -- Table of Contents -- Preface -- Chapter 1: First Steps -- Introducing data science and Python -- Installing Python -- Python 2 or Python 3? -- Step-by-step installation -- Installing the necessary packages -- Package upgrades -- Scientific distributions -- Anaconda -- Leveraging conda to install packages -- Enthought Canopy -- WinPython -- Explaining virtual environments -- Conda for managing environments -- A glance at the essential packages -- NumPy -- SciPy -- pandas -- pandas-profiling -- Scikit-learn -- Jupyter -- JupyterLab -- Matplotlib -- Seaborn -- Statsmodels -- Beautiful Soup -- NetworkX -- NLTK -- Gensim -- PyPy -- XGBoost -- LightGBM -- CatBoost -- TensorFlow -- Keras -- Introducing Jupyter -- Fast installation and first test usage -- Jupyter magic commands -- Installing packages directly from Jupyter Notebooks -- Checking the new JupyterLab environment -- How Jupyter Notebooks can help data scientists -- Alternatives to Jupyter -- Datasets and code used in this book -- Scikit-learn toy datasets -- The MLdata.org and other public repositories for open source data -- LIBSVM data examples -- Loading data directly from CSV or text files -- Scikit-learn sample generators -- Summary -- Chapter 2: Data Munging -- The data science process -- Data loading and 
preprocessing with pandas -- Fast and easy data loading -- Dealing with problematic data -- Dealing with big datasets -- Accessing other data formats -- Putting data together -- Data preprocessing -- Data selection -- Working with categorical and textual data -- A special type of data - text -- Scraping the web with Beautiful Soup -- Data processing with NumPy -- NumPy's n-dimensional array -- The basics of NumPy ndarray objects -- Creating NumPy arrays -- From lists to unidimensional arrays</subfield></datafield><datafield tag="520" ind1="3" ind2=" "><subfield code="a">Controlling memory size -- Heterogeneous lists -- From lists to multidimensional arrays -- Resizing arrays -- Arrays derived from NumPy functions -- Getting an array directly from a file -- Extracting data from pandas -- NumPy fast operation and computations -- Matrix operations -- Slicing and indexing with NumPy arrays -- Stacking NumPy arrays -- Working with sparse arrays -- Summary -- Chapter 3: The Data Pipeline -- Introducing EDA -- Building new features -- Dimensionality reduction -- The covariance matrix -- Principal component analysis -- PCA for big data - RandomizedPCA -- Latent factor analysis -- Linear discriminant analysis -- Latent semantical analysis -- Independent component analysis -- Kernel PCA -- T-SNE -- Restricted Boltzmann Machine -- The detection and treatment of outliers -- Univariate outlier detection -- EllipticEnvelope -- OneClassSVM -- Validation metrics -- Multilabel classification -- Binary classification -- Regression -- Testing and validating -- Cross-validation -- Using cross-validation iterators -- Sampling and bootstrapping -- Hyperparameter optimization -- Building custom scoring functions -- Reducing the grid search runtime -- Feature selection -- Selection based on feature variance -- Univariate selection -- Recursive elimination -- Stability and L1-based selection -- Wrapping everything in a pipeline -- Combining features together and chaining transformations -- 
Building custom transformation functions -- Summary -- Chapter 4: Machine Learning -- Preparing tools and datasets -- Linear and logistic regression -- Naive Bayes -- K-Nearest Neighbors -- Nonlinear algorithms -- SVM for classification -- SVM for regression -- Tuning SVM -- Ensemble strategies -- Pasting by random samples -- Bagging with weak classifiers -- Random Subspaces and Random Patches -- Random Forests and Extra-Trees</subfield></datafield><datafield tag="520" ind1="3" ind2=" "><subfield code="a">Estimating probabilities from an ensemble -- Sequences of models - AdaBoost -- Gradient tree boosting (GTB) -- XGBoost -- LightGBM -- CatBoost -- Dealing with big data -- Creating some big datasets as examples -- Scalability with volume -- Keeping up with velocity -- Dealing with variety -- An overview of Stochastic Gradient Descent (SGD) -- A peek into natural language processing (NLP) -- Word tokenization -- Stemming -- Word tagging -- Named entity recognition (NER) -- Stopwords -- A complete data science example - text classification -- An overview of unsupervised learning -- K-means -- DBSCAN - a density-based clustering technique -- Latent Dirichlet Allocation (LDA) -- Summary -- Chapter 5: Visualization, Insights, and Results -- Introducing the basics of matplotlib -- Trying curve plotting -- Using panels for clearer representations -- Plotting scatterplots for relationships in data -- Histograms -- Bar graphs -- Image visualization -- Selected graphical examples with pandas -- Working with boxplots and histograms -- Plotting scatterplots -- Discovering patterns by parallel coordinates -- Wrapping up matplotlib's commands -- Introducing Seaborn -- Enhancing your EDA capabilities -- Advanced data learning representation -- Learning curves -- Validation curves -- Feature importance for RandomForests -- Gradient Boosting Trees partial dependence plotting -- Creating a prediction server with machine-learning-as-a-service -- Summary -- Chapter 6: Social Network 
Analysis -- Introduction to graph theory -- Graph algorithms -- Types of node centrality -- Partitioning a network -- Graph loading, dumping, and sampling -- Summary -- Chapter 7: Deep Learning Beyond the Basics -- Approaching deep learning -- Classifying images with CNN -- Using pre-trained models -- Working with temporal sequences -- Summary -- Chapter 8: Spark for Big Data</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Python</subfield><subfield code="g">Programmiersprache</subfield><subfield code="0">(DE-588)4434275-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Python</subfield><subfield code="g">Programmiersprache</subfield><subfield code="0">(DE-588)4434275-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Massaron, Luca</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1104968622</subfield><subfield code="4">aut</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Druck-Ausgabe</subfield><subfield code="z">9781789537864</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">DNB Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&amp;doc_library=BVB01&amp;local_base=BVB01&amp;doc_number=030759353&amp;sequence=000001&amp;line_number=0001&amp;func_code=DB_RECORDS&amp;service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ZDB-4-NLEBK</subfield><subfield code="a">ZDB-30-PQE</subfield><subfield code="a">ZDB-5-WPSE</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield 
code="a">oai:aleph.bib-bvb.de:BVB01-030759353</subfield></datafield><datafield tag="966" ind1="e" ind2=" "><subfield code="u">https://ebookcentral.proquest.com/lib/unibwm/detail.action?docID=5532279</subfield><subfield code="l">UBY01</subfield><subfield code="p">ZDB-30-PQE</subfield><subfield code="q">UBY01_Einzelkauf18</subfield><subfield code="x">Aggregator</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="966" ind1="e" ind2=" "><subfield code="u">https://ebookcentral.proquest.com/lib/erlangen/detail.action?docID=5532279</subfield><subfield code="l">UER01</subfield><subfield code="p">ZDB-30-PQE</subfield><subfield code="q">UER_PDA_PQE_Kauf_2023</subfield><subfield code="x">Aggregator</subfield><subfield code="3">Volltext</subfield></datafield></record></collection> |
id | DE-604.BV045372897 |
illustrated | Not Illustrated |
indexdate | 2024-07-10T08:16:21Z |
institution | BVB |
isbn | 9781789531893 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-030759353 |
oclc_num | 1079414711 |
open_access_boolean | |
owner | DE-706 DE-83 DE-29 |
owner_facet | DE-706 DE-83 DE-29 |
physical | 1 Online-Ressource (vi, 453 Seiten) Illustrationen, Diagramme |
psigel | ZDB-4-NLEBK ZDB-30-PQE ZDB-5-WPSE ZDB-30-PQE UBY01_Einzelkauf18 ZDB-30-PQE UER_PDA_PQE_Kauf_2023 |
publishDate | 2018 |
publishDateSearch | 2018 |
publishDateSort | 2018 |
publisher | Packt |
record_format | marc |
spelling | Boschetti, Alberto Verfasser (DE-588)1182494579 aut Python data science essentials a practitioner's guide covering essential data science principles, tools, and techniques Third edition Birmingham ; Mumbai Packt September 2018 1 Online-Ressource (vi, 453 Seiten) Illustrationen, Diagramme txt rdacontent c rdamedia cr rdacarrier Cover -- Title Page -- Copyright and Credits -- Packt Upsell -- Contributors -- Table of Contents -- Preface -- Chapter 1: First Steps -- Introducing data science and Python -- Installing Python -- Python 2 or Python 3? -- Step-by-step installation -- Installing the necessary packages -- Package upgrades -- Scientific distributions -- Anaconda -- Leveraging conda to install packages -- Enthought Canopy -- WinPython -- Explaining virtual environments -- Conda for managing environments -- A glance at the essential packages -- NumPy -- SciPy -- pandas -- pandas-profiling -- Scikit-learn -- Jupyter -- JupyterLab -- Matplotlib -- Seaborn -- Statsmodels -- Beautiful Soup -- NetworkX -- NLTK -- Gensim -- PyPy -- XGBoost -- LightGBM -- CatBoost -- TensorFlow -- Keras -- Introducing Jupyter -- Fast installation and first test usage -- Jupyter magic commands -- Installing packages directly from Jupyter Notebooks -- Checking the new JupyterLab environment -- How Jupyter Notebooks can help data scientists -- Alternatives to Jupyter -- Datasets and code used in this book -- Scikit-learn toy datasets -- The MLdata.org and other public repositories for open source data -- LIBSVM data examples -- Loading data directly from CSV or text files -- Scikit-learn sample generators -- Summary -- Chapter 2: Data Munging -- The data science process -- Data loading and preprocessing with pandas -- Fast and easy data loading -- Dealing with problematic data -- Dealing with big datasets -- Accessing other data formats -- Putting data together -- Data preprocessing -- Data selection -- Working with categorical and textual data -- A special type of data - text 
-- Scraping the web with Beautiful Soup -- Data processing with NumPy -- NumPy's n-dimensional array -- The basics of NumPy ndarray objects -- Creating NumPy arrays -- From lists to unidimensional arrays Controlling memory size -- Heterogeneous lists -- From lists to multidimensional arrays -- Resizing arrays -- Arrays derived from NumPy functions -- Getting an array directly from a file -- Extracting data from pandas -- NumPy fast operation and computations -- Matrix operations -- Slicing and indexing with NumPy arrays -- Stacking NumPy arrays -- Working with sparse arrays -- Summary -- Chapter 3: The Data Pipeline -- Introducing EDA -- Building new features -- Dimensionality reduction -- The covariance matrix -- Principal component analysis -- PCA for big data - RandomizedPCA -- Latent factor analysis -- Linear discriminant analysis -- Latent semantical analysis -- Independent component analysis -- Kernel PCA -- T-SNE -- Restricted Boltzmann Machine -- The detection and treatment of outliers -- Univariate outlier detection -- EllipticEnvelope -- OneClassSVM -- Validation metrics -- Multilabel classification -- Binary classification -- Regression -- Testing and validating -- Cross-validation -- Using cross-validation iterators -- Sampling and bootstrapping -- Hyperparameter optimization -- Building custom scoring functions -- Reducing the grid search runtime -- Feature selection -- Selection based on feature variance -- Univariate selection -- Recursive elimination -- Stability and L1-based selection -- Wrapping everything in a pipeline -- Combining features together and chaining transformations -- Building custom transformation functions -- Summary -- Chapter 4: Machine Learning -- Preparing tools and datasets -- Linear and logistic regression -- Naive Bayes -- K-Nearest Neighbors -- Nonlinear algorithms -- SVM for classification -- SVM for regression -- Tuning SVM -- Ensemble strategies -- Pasting by random samples -- Bagging with weak classifiers -- Random 
Subspaces and Random Patches -- Random Forests and Extra-Trees Estimating probabilities from an ensemble -- Sequences of models - AdaBoost -- Gradient tree boosting (GTB) -- XGBoost -- LightGBM -- CatBoost -- Dealing with big data -- Creating some big datasets as examples -- Scalability with volume -- Keeping up with velocity -- Dealing with variety -- An overview of Stochastic Gradient Descent (SGD) -- A peek into natural language processing (NLP) -- Word tokenization -- Stemming -- Word tagging -- Named entity recognition (NER) -- Stopwords -- A complete data science example - text classification -- An overview of unsupervised learning -- K-means -- DBSCAN - a density-based clustering technique -- Latent Dirichlet Allocation (LDA) -- Summary -- Chapter 5: Visualization, Insights, and Results -- Introducing the basics of matplotlib -- Trying curve plotting -- Using panels for clearer representations -- Plotting scatterplots for relationships in data -- Histograms -- Bar graphs -- Image visualization -- Selected graphical examples with pandas -- Working with boxplots and histograms -- Plotting scatterplots -- Discovering patterns by parallel coordinates -- Wrapping up matplotlib's commands -- Introducing Seaborn -- Enhancing your EDA capabilities -- Advanced data learning representation -- Learning curves -- Validation curves -- Feature importance for RandomForests -- Gradient Boosting Trees partial dependence plotting -- Creating a prediction server with machine-learning-as-a-service -- Summary -- Chapter 6: Social Network Analysis -- Introduction to graph theory -- Graph algorithms -- Types of node centrality -- Partitioning a network -- Graph loading, dumping, and sampling -- Summary -- Chapter 7: Deep Learning Beyond the Basics -- Approaching deep learning -- Classifying images with CNN -- Using pre-trained models -- Working with temporal sequences -- Summary -- Chapter 8: Spark for Big Data Python Programmiersprache (DE-588)4434275-5 gnd rswk-swf Python 
Programmiersprache (DE-588)4434275-5 s DE-604 Massaron, Luca Verfasser (DE-588)1104968622 aut Erscheint auch als Druck-Ausgabe 9781789537864 DNB Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=030759353&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Boschetti, Alberto Massaron, Luca Python data science essentials a practitioner's guide covering essential data science principles, tools, and techniques Python Programmiersprache (DE-588)4434275-5 gnd |
subject_GND | (DE-588)4434275-5 |
title | Python data science essentials a practitioner's guide covering essential data science principles, tools, and techniques |
title_auth | Python data science essentials a practitioner's guide covering essential data science principles, tools, and techniques |
title_exact_search | Python data science essentials a practitioner's guide covering essential data science principles, tools, and techniques |
title_full | Python data science essentials a practitioner's guide covering essential data science principles, tools, and techniques |
title_fullStr | Python data science essentials a practitioner's guide covering essential data science principles, tools, and techniques |
title_full_unstemmed | Python data science essentials a practitioner's guide covering essential data science principles, tools, and techniques |
title_short | Python data science essentials |
title_sort | python data science essentials a practitioner s guide covering essential data science principles tools and techniques |
title_sub | a practitioner's guide covering essential data science principles, tools, and techniques |
topic | Python Programmiersprache (DE-588)4434275-5 gnd |
topic_facet | Python Programmiersprache |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=030759353&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT boschettialberto pythondatascienceessentialsapractitionersguidecoveringessentialdatascienceprinciplestoolsandtechniques AT massaronluca pythondatascienceessentialsapractitionersguidecoveringessentialdatascienceprinciplestoolsandtechniques |