Python data science essentials: a practitioner's guide covering essential data science principles, tools, and techniques


Bibliographic details
Main authors: Boschetti, Alberto (author), Massaron, Luca (author)
Format: Electronic, e-book
Language: English
Published: Birmingham ; Mumbai: Packt, September 2018
Edition: Third edition
Online access: UBY01, UER01
Summary:
Cover -- Title Page -- Copyright and Credits -- Packt Upsell -- Contributors -- Table of Contents -- Preface
Chapter 1: First Steps -- Introducing data science and Python -- Installing Python -- Python 2 or Python 3? -- Step-by-step installation -- Installing the necessary packages -- Package upgrades -- Scientific distributions -- Anaconda -- Leveraging conda to install packages -- Enthought Canopy -- WinPython -- Explaining virtual environments -- Conda for managing environments -- A glance at the essential packages -- NumPy -- SciPy -- pandas -- pandas-profiling -- Scikit-learn -- Jupyter -- JupyterLab -- Matplotlib -- Seaborn -- Statsmodels -- Beautiful Soup -- NetworkX -- NLTK -- Gensim -- PyPy -- XGBoost -- LightGBM -- CatBoost -- TensorFlow -- Keras -- Introducing Jupyter -- Fast installation and first test usage -- Jupyter magic commands -- Installing packages directly from Jupyter Notebooks -- Checking the new JupyterLab environment -- How Jupyter Notebooks can help data scientists -- Alternatives to Jupyter -- Datasets and code used in this book -- Scikit-learn toy datasets -- The MLdata.org and other public repositories for open source data -- LIBSVM data examples -- Loading data directly from CSV or text files -- Scikit-learn sample generators -- Summary
Chapter 2: Data Munging -- The data science process -- Data loading and preprocessing with pandas -- Fast and easy data loading -- Dealing with problematic data -- Dealing with big datasets -- Accessing other data formats -- Putting data together -- Data preprocessing -- Data selection -- Working with categorical and textual data -- A special type of data - text -- Scraping the web with Beautiful Soup -- Data processing with NumPy -- NumPy's n-dimensional array -- The basics of NumPy ndarray objects -- Creating NumPy arrays -- From lists to unidimensional arrays -- Controlling memory size -- Heterogeneous lists -- From lists to multidimensional arrays -- Resizing arrays -- Arrays derived from NumPy functions -- Getting an array directly from a file -- Extracting data from pandas -- NumPy fast operation and computations -- Matrix operations -- Slicing and indexing with NumPy arrays -- Stacking NumPy arrays -- Working with sparse arrays -- Summary
Chapter 3: The Data Pipeline -- Introducing EDA -- Building new features -- Dimensionality reduction -- The covariance matrix -- Principal component analysis -- PCA for big data - RandomizedPCA -- Latent factor analysis -- Linear discriminant analysis -- Latent semantical analysis -- Independent component analysis -- Kernel PCA -- T-SNE -- Restricted Boltzmann Machine -- The detection and treatment of outliers -- Univariate outlier detection -- EllipticEnvelope -- OneClassSVM -- Validation metrics -- Multilabel classification -- Binary classification -- Regression -- Testing and validating -- Cross-validation -- Using cross-validation iterators -- Sampling and bootstrapping -- Hyperparameter optimization -- Building custom scoring functions -- Reducing the grid search runtime -- Feature selection -- Selection based on feature variance -- Univariate selection -- Recursive elimination -- Stability and L1-based selection -- Wrapping everything in a pipeline -- Combining features together and chaining transformations -- Building custom transformation functions -- Summary
Chapter 4: Machine Learning -- Preparing tools and datasets -- Linear and logistic regression -- Naive Bayes -- K-Nearest Neighbors -- Nonlinear algorithms -- SVM for classification -- SVM for regression -- Tuning SVM -- Ensemble strategies -- Pasting by random samples -- Bagging with weak classifiers -- Random Subspaces and Random Patches -- Random Forests and Extra-Trees -- Estimating probabilities from an ensemble -- Sequences of models - AdaBoost -- Gradient tree boosting (GTB) -- XGBoost -- LightGBM -- CatBoost -- Dealing with big data -- Creating some big datasets as examples -- Scalability with volume -- Keeping up with velocity -- Dealing with variety -- An overview of Stochastic Gradient Descent (SGD) -- A peek into natural language processing (NLP) -- Word tokenization -- Stemming -- Word tagging -- Named entity recognition (NER) -- Stopwords -- A complete data science example - text classification -- An overview of unsupervised learning -- K-means -- DBSCAN - a density-based clustering technique -- Latent Dirichlet Allocation (LDA) -- Summary
Chapter 5: Visualization, Insights, and Results -- Introducing the basics of matplotlib -- Trying curve plotting -- Using panels for clearer representations -- Plotting scatterplots for relationships in data -- Histograms -- Bar graphs -- Image visualization -- Selected graphical examples with pandas -- Working with boxplots and histograms -- Plotting scatterplots -- Discovering patterns by parallel coordinates -- Wrapping up matplotlib's commands -- Introducing Seaborn -- Enhancing your EDA capabilities -- Advanced data learning representation -- Learning curves -- Validation curves -- Feature importance for RandomForests -- Gradient Boosting Trees partial dependence plotting -- Creating a prediction server with machine-learning-as-a-service -- Summary
Chapter 6: Social Network Analysis -- Introduction to graph theory -- Graph algorithms -- Types of node centrality -- Partitioning a network -- Graph loading, dumping, and sampling -- Summary
Chapter 7: Deep Learning Beyond the Basics -- Approaching deep learning -- Classifying images with CNN -- Using pre-trained models -- Working with temporal sequences -- Summary
Chapter 8: Spark for Big Data
Description: 1 online resource (vi, 453 pages), illustrations, diagrams
ISBN: 9781789531893

No print copy is available.
