Practical data analysis cookbook: over 60 practical recipes on data exploration and analysis
Main author: | Drabas, Tomasz |
---|---|
Format: | Book |
Language: | English |
Published: | Packt Publ. 2016 |
Subjects: | Datenanalyse |
Online access: | Table of contents |
Description: | 365 pages |
ISBN: | 9781783551668 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV043644796 | ||
003 | DE-604 | ||
005 | 20170620 | ||
007 | t | ||
008 | 160628s2016 |||| 00||| eng d | ||
020 | |a 9781783551668 |9 978-1-78355-166-8 | ||
035 | |a (OCoLC)953525776 | ||
035 | |a (DE-599)BVBBV043644796 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-83 |a DE-739 | ||
084 | |a ST 600 |0 (DE-625)143681: |2 rvk | ||
100 | 1 | |a Drabas, Tomasz |e Verfasser |4 aut | |
245 | 1 | 0 | |a Practical data analysis cookbook |b over 60 practical recipes on data exploration and analysis |
264 | 1 | |b Packt Publ. |c 2016 | |
300 | |a 365 Seiten | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
650 | 0 | 7 | |a Datenanalyse |0 (DE-588)4123037-1 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Datenanalyse |0 (DE-588)4123037-1 |D s |
689 | 0 | |5 DE-604 | |
856 | 4 | 2 | |m Digitalisierung UB Passau - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029058535&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-029058535 |
Record in the search index
_version_ | 1804176388190633984 |
---|---|
adam_text | Table of Contents
Preface v
Chapter 1: Preparing the Data 1
Introduction 2
Reading and writing CSV/TSV files with Python 2
Reading and writing JSON files with Python 8
Reading and writing Excel files with Python 9
Reading and writing XML files with Python 12
Retrieving HTML pages with pandas 16
Storing and retrieving from a relational database 19
Storing and retrieving from MongoDB 23
Opening and transforming data with OpenRefine 25
Exploring the data with OpenRefine 28
Removing duplicates 31
Using regular expressions and GREL to clean up data 34
Imputing missing observations 36
Normalizing and standardizing the features 37
Binning the observations 39
Encoding categorical variables 41
Chapter 2: Exploring the Data 43
Introduction 43
Producing descriptive statistics 43
Exploring correlations between features 46
Visualizing the interactions between features 48
Producing histograms 54
Creating multivariate charts 58
Sampling the data 61
Splitting the dataset into training, cross-validation, and testing 63
Chapter 3: Classification Techniques 67
Introduction 67
Testing and comparing the models 68
Classifying with Naïve Bayes 71
Using logistic regression as a universal classifier 74
Utilizing Support Vector Machines as a classification engine 79
Classifying calls with decision trees 83
Predicting subscribers with random tree forests 88
Employing neural networks to classify calls 92
Chapter 4: Clustering Techniques 101
Introduction 101
Assessing the performance of a clustering method 102
Clustering data with k-means algorithm 105
Finding an optimal number of clusters for k-means 108
Discovering clusters with mean shift clustering model 115
Building fuzzy clustering model with c-means 116
Using hierarchical model to cluster your data 119
Finding groups of potential subscribers with DBSCAN and BIRCH algorithms 123
Chapter 5: Reducing Dimensions 127
Introduction 127
Creating three-dimensional scatter plots to present principal components 128
Reducing the dimensions using the kernel version of PCA 131
Using Principal Component Analysis to find things that matter 135
Finding the principal components in your data using randomized PCA 140
Extracting the useful dimensions using Linear Discriminant Analysis 147
Using various dimension reduction techniques to classify calls using the k-Nearest Neighbors classification model 151
Chapter 6: Regression Methods 157
Introduction 157
Identifying and tackling multicollinearity 160
Building Linear Regression model 165
Using OLS to forecast how much electricity can be produced 172
Estimating the output of an electric plant using CART 177
Employing the kNN model in a regression problem 181
Applying the Random Forest model to a regression analysis 184
Gauging the amount of electricity a plant can produce using SVMs 187
Training a Neural Network to predict the output of a power plant 194
Chapter 7: Time Series Techniques 197
Introduction 197
Handling date objects in Python 198
Understanding time series data 203
Smoothing and transforming the observations 208
Filtering the time series data 212
Removing trend and seasonality 216
Forecasting the future with ARMA and ARIMA models 223
Chapter 8: Graphs 233
Introduction 233
Handling graph objects in Python with NetworkX 234
Using Gephi to visualize graphs 244
Identifying people whose credit card details were stolen 258
Identifying those responsible for stealing the credit cards 263
Chapter 9: Natural Language Processing 267
Introduction 267
Reading raw text from the Web 268
Tokenizing and normalizing text 274
Identifying parts of speech, handling n-grams, and recognizing named entities 282
Identifying the topic of an article 289
Identifying the sentence structure 292
Classifying movies based on their reviews 295
Chapter 10: Discrete Choice Models 301
Introduction 301
Preparing a dataset to estimate discrete choice models 303
Estimating the well-known Multinomial Logit model 309
Testing for violations of the Independence from Irrelevant Alternatives 316
Handling IIA violations with the Nested Logit model 322
Managing sophisticated substitution patterns with the Mixed Logit model 325
Chapter 11: Simulations 329
Introduction 329
Using SimPy to simulate the refueling process of a gas station 330
Simulating out-of-energy occurrences for an electric car 342
Determining if a population of sheep is in danger of extinction due to a wolf pack 348
Index 359
|
any_adam_object | 1 |
author | Drabas, Tomasz |
author_facet | Drabas, Tomasz |
author_role | aut |
author_sort | Drabas, Tomasz |
author_variant | t d td |
building | Verbundindex |
bvnumber | BV043644796 |
classification_rvk | ST 600 |
ctrlnum | (OCoLC)953525776 (DE-599)BVBBV043644796 |
discipline | Informatik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01192nam a2200313 c 4500</leader><controlfield tag="001">BV043644796</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20170620 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">160628s2016 |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781783551668</subfield><subfield code="9">978-1-78355-166-8</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)953525776</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV043644796</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-83</subfield><subfield code="a">DE-739</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 600</subfield><subfield code="0">(DE-625)143681:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Drabas, Tomasz</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Practical data analysis cookbook</subfield><subfield code="b">over 60 practical recipes on data exploration and analysis</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="b">Packt Publ.</subfield><subfield code="c">2016</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">365 Seiten</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" 
ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Datenanalyse</subfield><subfield code="0">(DE-588)4123037-1</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Datenanalyse</subfield><subfield code="0">(DE-588)4123037-1</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Passau - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029058535&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-029058535</subfield></datafield></record></collection> |
id | DE-604.BV043644796 |
illustrated | Not Illustrated |
indexdate | 2024-07-10T07:31:23Z |
institution | BVB |
isbn | 9781783551668 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-029058535 |
oclc_num | 953525776 |
open_access_boolean | |
owner | DE-83 DE-739 |
owner_facet | DE-83 DE-739 |
physical | 365 Seiten |
publishDate | 2016 |
publishDateSearch | 2016 |
publishDateSort | 2016 |
publisher | Packt Publ. |
record_format | marc |
spelling | Drabas, Tomasz Verfasser aut Practical data analysis cookbook over 60 practical recipes on data exploration and analysis Packt Publ. 2016 365 Seiten txt rdacontent n rdamedia nc rdacarrier Datenanalyse (DE-588)4123037-1 gnd rswk-swf Datenanalyse (DE-588)4123037-1 s DE-604 Digitalisierung UB Passau - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029058535&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Drabas, Tomasz Practical data analysis cookbook over 60 practical recipes on data exploration and analysis Datenanalyse (DE-588)4123037-1 gnd |
subject_GND | (DE-588)4123037-1 |
title | Practical data analysis cookbook over 60 practical recipes on data exploration and analysis |
title_auth | Practical data analysis cookbook over 60 practical recipes on data exploration and analysis |
title_exact_search | Practical data analysis cookbook over 60 practical recipes on data exploration and analysis |
title_full | Practical data analysis cookbook over 60 practical recipes on data exploration and analysis |
title_fullStr | Practical data analysis cookbook over 60 practical recipes on data exploration and analysis |
title_full_unstemmed | Practical data analysis cookbook over 60 practical recipes on data exploration and analysis |
title_short | Practical data analysis cookbook |
title_sort | practical data analysis cookbook over 60 practical recipes on data exploration and analysis |
title_sub | over 60 practical recipes on data exploration and analysis |
topic | Datenanalyse (DE-588)4123037-1 gnd |
topic_facet | Datenanalyse |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029058535&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT drabastomasz practicaldataanalysiscookbookover60practicalrecipesondataexplorationandanalysis |