Thoughtful data science: a programmer's toolset for data analysis and artificial intelligence with Python, Jupyter Notebook, and PixieDust
Main author: | Taieb, David |
---|---|
Format: | Book |
Language: | English |
Published: | Birmingham ; Mumbai: Packt, July 2018 |
Subjects: | Data Science; Datenanalyse; Künstliche Intelligenz |
Summary: | Cover -- Copyright -- Packt upsell -- Contributors -- Table of Contents -- Preface -- Chapter 1 - Perspectives on Data Science from a Developer -- What is data science -- Is data science here to stay? -- Why is data science on the rise? -- What does that have to do with developers? -- Putting these concepts into practice -- Deep diving into a concrete example -- Data pipeline blueprint -- What kind of skills are required to become a data scientist? -- IBM Watson DeepQA -- Back to our sentiment analysis of Twitter hashtags project -- Lessons learned from building our first enterprise-ready data pipeline -- Data science strategy -- Jupyter Notebooks at the center of our strategy -- Why are Notebooks so popular? -- Summary -- Chapter 2 - Data Science at Scale with Jupyter Notebooks and PixieDust -- Why choose Python? -- Introducing PixieDust -- SampleData - a simple API for loading data -- Wrangling data with pixiedust_rosie -- Display - a simple interactive API for data visualization -- Filtering -- Bridging the gap between developers and data scientists with PixieApps -- Architecture for operationalizing data science analytics -- Summary -- Chapter 3 - PixieApp under the Hood -- Anatomy of a PixieApp -- Routes -- Generating requests to routes -- A GitHub project tracking sample application -- Displaying the search results in a table -- Invoking the PixieDust display() API using pd_entity attribute -- Invoking arbitrary Python code with pd_script -- Making the application more responsive with pd_refresh -- Creating reusable widgets -- Summary -- Chapter 4 - Deploying PixieApps to the Web with the PixieGateway Server -- Overview of Kubernetes -- Installing and configuring the PixieGateway server -- PixieGateway server configuration -- PixieGateway architecture -- Publishing an application -- Encoding state in the PixieApp URL -- Sharing charts by publishing them as web pages -- PixieGateway admin console -- Python Console -- Displaying warmup and run code for a PixieApp -- Summary -- Chapter 5 - Best Practices and Advanced PixieDust Concepts -- Use @captureOutput decorator to integrate the output of third-party Python libraries -- Create a word cloud image with @captureOutput -- Increase modularity and code reuse -- Creating a widget with pd_widget -- PixieDust support of streaming data -- Adding streaming capabilities to your PixieApp -- Adding dashboard drill-downs with PixieApp events -- Extending PixieDust visualizations -- Debugging -- Debugging on the Jupyter Notebook using pdb -- Visual debugging with PixieDebugger -- Debugging PixieApp routes with PixieDebugger -- Troubleshooting issues using PixieDust logging -- Client-side debugging -- Run Node.js inside a Python Notebook -- Summary -- Chapter 6 - Image Recognition with TensorFlow -- What is machine learning? -- What is deep learning? -- Getting started with TensorFlow -- Simple classification with DNNClassifier -- Image recognition sample application -- Part 1 - Load the pretrained MobileNet model -- Part 2 - Create a PixieApp for our image recognition sample application -- Part 3 - Integrate the TensorBoard graph visualization -- Part 4 - Retrain the model with custom training data -- Summary -- Chapter 7 - Big Data Twitter Sentiment Analysis -- Getting started with Apache Spark -- Apache Spark architecture -- Configuring Notebooks to work with Spark -- Twitter sentiment analysis application -- Part 1 - Acquiring the data with Spark Structured Streaming -- Architecture diagram for the data pipeline -- Authentication with Twitter -- Creating the Twitter stream -- Creating a Spark Streaming DataFrame -- Creating and running a structured query -- Monitoring active streaming queries -- Creating a batch DataFrame from the Parquet files -- Part 2 - Enriching the data with sentiment and most relevant extracted entity -- Getting started with the IBM Watson Natural Language Understanding service -- Part 3 - Creating a real-time dashboard PixieApp -- Refactoring the analytics into their own methods -- Creating the PixieApp -- Part 4 - Adding scalability with Apache Kafka and IBM Streams Designer -- Streaming the raw tweets to Kafka -- Enriching the tweets data with the Streaming Analytics service -- Creating a Spark Streaming DataFrame with a Kafka input source -- Summary -- Chapter 8 - Financial Time Series Analysis and Forecasting -- Getting started with NumPy -- Creating a NumPy array -- Operations on ndarray -- Selections on NumPy arrays -- Broadcasting -- Statistical exploration of time series -- Hypothetical investment -- Autocorrelation function (ACF) and partial autocorrelation function (PACF) -- Putting it all together with the StockExplorer PixieApp -- BaseSubApp - base class for all the child PixieApps -- StockExploreSubApp - first child PixieApp -- MovingAverageSubApp - second child PixieApp -- AutoCorrelationSubApp - third child PixieApp -- Time series forecasting using the ARIMA model -- Build an ARIMA model for the MSFT stock time series -- StockExplorer PixieApp Part 2 - add time series forecasting using the ARIMA model -- Summary -- Chapter 9 - US Domestic Flight Data Analysis Using Graphs -- Introduction to graphs -- Graph representations -- Graph algorithms -- Graph and big data -- Getting started with the networkx graph library -- Creating a graph -- Visualizing a graph -- Part 1 - Loading the US domestic flight data into a graph -- Graph centrality -- Part 2 - Creating the USFlightsAnalysis PixieApp -- Part 3 - Adding data exploration to the USFlightsAnalysis PixieApp |
Description: | xvi, 467 pages, illustrations, diagrams |
ISBN: | 9781788839969 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV045422162 | ||
003 | DE-604 | ||
005 | 20190225 | ||
007 | t | ||
008 | 190123s2018 a||| |||| 00||| eng d | ||
020 | |a 9781788839969 |c pbk. |9 978-1-78883-996-9 | ||
035 | |a (OCoLC)1088316461 | ||
035 | |a (DE-599)BVBBV045422162 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-706 | ||
100 | 1 | |a Taieb, David |d ca. 20./21. Jh. |e Verfasser |0 (DE-588)1178741656 |4 aut | |
245 | 1 | 0 | |a Thoughtful data science |b a programmer's toolset for data analysis and artificial intelligence with Python, Jupyter Notebook, and PixieDust |c David Taieb |
264 | 1 | |a Birmingham ; Mumbai |b Packt |c July 2018 | |
300 | |a xvi, 467 Seiten |b Illustrationen, Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
520 | 3 | |a Cover -- Copyright -- Packt upsell -- Contributors -- Table of Contents -- Preface -- Chapter 1 - Perspectives on Data Science from a Developer -- What is data science -- Is data science here to stay? -- Why is data science on the rise? -- What does that have to do with developers? -- Putting these concepts into practice -- Deep diving into a concrete example -- Data pipeline blueprint -- What kind of skills are required to become a data scientist? -- IBM Watson DeepQA -- Back to our sentiment analysis of Twitter hashtags project -- Lessons learned from building our first enterprise-ready data pipeline -- Data science strategy -- Jupyter Notebooks at the center of our strategy -- Why are Notebooks so popular? -- Summary -- Chapter 2 - Data Science at Scale with Jupyter Notebooks and PixieDust -- Why choose Python? -- Introducing PixieDust -- SampleData - a simple API for loading data -- Wrangling data with pixiedust_rosie -- Display - a simple interactive API for data visualization -- Filtering -- Bridging the gap between developers and data scientists with PixieApps -- Architecture for operationalizing data science analytics -- Summary -- Chapter 3 - PixieApp under the Hood -- Anatomy of a PixieApp -- Routes -- Generating requests to routes -- A GitHub project tracking sample application -- Displaying the search results in a table -- Invoking the PixieDust display() API using pd_entity attribute -- Invoking arbitrary Python code with pd_script -- Making the application more responsive with pd_refresh -- Creating reusable widgets -- Summary -- Chapter 4 - Deploying PixieApps to the Web with the PixieGateway Server -- Overview of Kubernetes -- Installing and configuring the PixieGateway server -- PixieGateway server configuration -- PixieGateway architecture -- Publishing an application -- Encoding state in the PixieApp URL | |
520 | 3 | |a Sharing charts by publishing them as web pages -- PixieGateway admin console -- Python Console -- Displaying warmup and run code for a PixieApp -- Summary -- Chapter 5 - Best Practices and Advanced PixieDust Concepts -- Use @captureOutput decorator to integrate the output of third-party Python libraries -- Create a word cloud image with @captureOutput -- Increase modularity and code reuse -- Creating a widget with pd_widget -- PixieDust support of streaming data -- Adding streaming capabilities to your PixieApp -- Adding dashboard drill-downs with PixieApp events -- Extending PixieDust visualizations -- Debugging -- Debugging on the Jupyter Notebook using pdb -- Visual debugging with PixieDebugger -- Debugging PixieApp routes with PixieDebugger -- Troubleshooting issues using PixieDust logging -- Client-side debugging -- Run Node.js inside a Python Notebook -- Summary -- Chapter 6 - Image Recognition with TensorFlow -- What is machine learning? -- What is deep learning? -- Getting started with TensorFlow -- Simple classification with DNNClassifier -- Image recognition sample application -- Part 1 - Load the pretrained MobileNet model -- Part 2 - Create a PixieApp for our image recognition sample application -- Part 3 - Integrate the TensorBoard graph visualization -- Part 4 - Retrain the model with custom training data -- Summary -- Chapter 7 - Big Data Twitter Sentiment Analysis -- Getting started with Apache Spark -- Apache Spark architecture -- Configuring Notebooks to work with Spark -- Twitter sentiment analysis application -- Part 1 - Acquiring the data with Spark Structured Streaming -- Architecture diagram for the data pipeline -- Authentication with Twitter -- Creating the Twitter stream -- Creating a Spark Streaming DataFrame -- Creating and running a structured query -- Monitoring active streaming queries | |
520 | 3 | |a Creating a batch DataFrame from the Parquet files -- Part 2 - Enriching the data with sentiment and most relevant extracted entity -- Getting started with the IBM Watson Natural Language Understanding service -- Part 3 - Creating a real-time dashboard PixieApp -- Refactoring the analytics into their own methods -- Creating the PixieApp -- Part 4 - Adding scalability with Apache Kafka and IBM Streams Designer -- Streaming the raw tweets to Kafka -- Enriching the tweets data with the Streaming Analytics service -- Creating a Spark Streaming DataFrame with a Kafka input source -- Summary -- Chapter 8 - Financial Time Series Analysis and Forecasting -- Getting started with NumPy -- Creating a NumPy array -- Operations on ndarray -- Selections on NumPy arrays -- Broadcasting -- Statistical exploration of time series -- Hypothetical investment -- Autocorrelation function (ACF) and partial autocorrelation function (PACF) -- Putting it all together with the StockExplorer PixieApp -- BaseSubApp - base class for all the child PixieApps -- StockExploreSubApp - first child PixieApp -- MovingAverageSubApp - second child PixieApp -- AutoCorrelationSubApp - third child PixieApp -- Time series forecasting using the ARIMA model -- Build an ARIMA model for the MSFT stock time series -- StockExplorer PixieApp Part 2 - add time series forecasting using the ARIMA model -- Summary -- Chapter 9 - US Domestic Flight Data Analysis Using Graphs -- Introduction to graphs -- Graph representations -- Graph algorithms -- Graph and big data -- Getting started with the networkx graph library -- Creating a graph -- Visualizing a graph -- Part 1 - Loading the US domestic flight data into a graph -- Graph centrality -- Part 2 - Creating the USFlightsAnalysis PixieApp -- Part 3 - Adding data exploration to the USFlightsAnalysis PixieApp | |
650 | 0 | 7 | |a Data Science |0 (DE-588)1140936166 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Datenanalyse |0 (DE-588)4123037-1 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Künstliche Intelligenz |0 (DE-588)4033447-8 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Data Science |0 (DE-588)1140936166 |D s |
689 | 0 | 1 | |a Datenanalyse |0 (DE-588)4123037-1 |D s |
689 | 0 | 2 | |a Künstliche Intelligenz |0 (DE-588)4033447-8 |D s |
689 | 0 | |5 DE-604 | |
776 | 0 | 8 | |i Erscheint auch als |n Online-Ausgabe |z 978-1-78883-043-0 |
999 | |a oai:aleph.bib-bvb.de:BVB01-030808030 |
Record in the search index
_version_ | 1804179302590185472 |
---|---|
any_adam_object | |
author | Taieb, David ca. 20./21. Jh |
author_GND | (DE-588)1178741656 |
author_facet | Taieb, David ca. 20./21. Jh |
author_role | aut |
author_sort | Taieb, David ca. 20./21. Jh |
author_variant | d t dt |
building | Verbundindex |
bvnumber | BV045422162 |
ctrlnum | (OCoLC)1088316461 (DE-599)BVBBV045422162 |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>06939nam a2200385 c 4500</leader><controlfield tag="001">BV045422162</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20190225 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">190123s2018 a||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781788839969</subfield><subfield code="c">pbk.</subfield><subfield code="9">978-1-78883-996-9</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1088316461</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV045422162</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-706</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Taieb, David</subfield><subfield code="d">ca. 20./21. 
Jh.</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1178741656</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Thoughtful data science</subfield><subfield code="b">a programmer's toolset for data analysis and artificial intelligence with Python, Jupyter Notebook, and PixieDust</subfield><subfield code="c">David Taieb</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Birmingham ; Mumbai</subfield><subfield code="b">Packt</subfield><subfield code="c">July 2018</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">xvi, 467 Seiten</subfield><subfield code="b">Illustrationen, Diagramme</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="520" ind1="3" ind2=" "><subfield code="a">Cover -- Copyright -- Packt upsell -- Contributors -- Table of Contents -- Preface -- Chapter 1 - Perspectives on Data Science from a Developer -- What is data science -- Is data science here to stay? -- Why is data science on the rise? -- What does that have to do with developers? -- Putting these concepts into practice -- Deep diving into a concrete example -- Data pipeline blueprint -- What kind of skills are required to become a data scientist? -- IBM Watson DeepQA -- Back to our sentiment analysis of Twitter hashtags project -- Lessons learned from building our first enterprise-ready data pipeline -- Data science strategy -- Jupyter Notebooks at the center of our strategy -- Why are Notebooks so popular? 
-- Summary -- Chapter 2 - Data Science at Scale with Jupyter Notebooks and PixieDust -- Why choose Python? -- Introducing PixieDust -- SampleData - a simple API for loading data -- Wrangling data with pixiedust_rosie -- Display - a simple interactive API for data visualization -- Filtering -- Bridging the gap between developers and data scientists with PixieApps -- Architecture for operationalizing data science analytics -- Summary -- Chapter 3 - PixieApp under the Hood -- Anatomy of a PixieApp -- Routes -- Generating requests to routes -- A GitHub project tracking sample application -- Displaying the search results in a table -- Invoking the PixieDust display() API using pd_entity attribute -- Invoking arbitrary Python code with pd_script -- Making the application more responsive with pd_refresh -- Creating reusable widgets -- Summary -- Chapter 4 - Deploying PixieApps to the Web with the PixieGateway Server -- Overview of Kubernetes -- Installing and configuring the PixieGateway server -- PixieGateway server configuration -- PixieGateway architecture -- Publishing an application -- Encoding state in the PixieApp URL</subfield></datafield><datafield tag="520" ind1="3" ind2=" "><subfield code="a">Sharing charts by publishing them as web pages -- PixieGateway admin console -- Python Console -- Displaying warmup and run code for a PixieApp -- Summary -- Chapter 5 - Best Practices and Advanced PixieDust Concepts -- Use @captureOutput decorator to integrate the output of third-party Python libraries -- Create a word cloud image with @captureOutput -- Increase modularity and code reuse -- Creating a widget with pd_widget -- PixieDust support of streaming data -- Adding streaming capabilities to your PixieApp -- Adding dashboard drill-downs with PixieApp events -- Extending PixieDust visualizations -- Debugging -- Debugging on the Jupyter Notebook using pdb -- Visual debugging with PixieDebugger -- Debugging PixieApp routes with PixieDebugger -- Troubleshooting issues 
using PixieDust logging -- Client-side debugging -- Run Node.js inside a Python Notebook -- Summary -- Chapter 6 - Image Recognition with TensorFlow -- What is machine learning? -- What is deep learning? -- Getting started with TensorFlow -- Simple classification with DNNClassifier -- Image recognition sample application -- Part 1 - Load the pretrained MobileNet model -- Part 2 - Create a PixieApp for our image recognition sample application -- Part 3 - Integrate the TensorBoard graph visualization -- Part 4 - Retrain the model with custom training data -- Summary -- Chapter 7 - Big Data Twitter Sentiment Analysis -- Getting started with Apache Spark -- Apache Spark architecture -- Configuring Notebooks to work with Spark -- Twitter sentiment analysis application -- Part 1 - Acquiring the data with Spark Structured Streaming -- Architecture diagram for the data pipeline -- Authentication with Twitter -- Creating the Twitter stream -- Creating a Spark Streaming DataFrame -- Creating and running a structured query -- Monitoring active streaming queries</subfield></datafield><datafield tag="520" ind1="3" ind2=" "><subfield code="a">Creating a batch DataFrame from the Parquet files -- Part 2 - Enriching the data with sentiment and most relevant extracted entity -- Getting started with the IBM Watson Natural Language Understanding service -- Part 3 - Creating a real-time dashboard PixieApp -- Refactoring the analytics into their own methods -- Creating the PixieApp -- Part 4 - Adding scalability with Apache Kafka and IBM Streams Designer -- Streaming the raw tweets to Kafka -- Enriching the tweets data with the Streaming Analytics service -- Creating a Spark Streaming DataFrame with a Kafka input source -- Summary -- Chapter 8 - Financial Time Series Analysis and Forecasting -- Getting started with NumPy -- Creating a NumPy array -- Operations on ndarray -- Selections on NumPy arrays -- Broadcasting -- Statistical exploration of time series -- Hypothetical investment -- 
Autocorrelation function (ACF) and partial autocorrelation function (PACF) -- Putting it all together with the StockExplorer PixieApp -- BaseSubApp - base class for all the child PixieApps -- StockExploreSubApp - first child PixieApp -- MovingAverageSubApp - second child PixieApp -- AutoCorrelationSubApp - third child PixieApp -- Time series forecasting using the ARIMA model -- Build an ARIMA model for the MSFT stock time series -- StockExplorer PixieApp Part 2 - add time series forecasting using the ARIMA model -- Summary -- Chapter 9 - US Domestic Flight Data Analysis Using Graphs -- Introduction to graphs -- Graph representations -- Graph algorithms -- Graph and big data -- Getting started with the networkx graph library -- Creating a graph -- Visualizing a graph -- Part 1 - Loading the US domestic flight data into a graph -- Graph centrality -- Part 2 - Creating the USFlightsAnalysis PixieApp -- Part 3 - Adding data exploration to the USFlightsAnalysis PixieApp</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Data Science</subfield><subfield code="0">(DE-588)1140936166</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Datenanalyse</subfield><subfield code="0">(DE-588)4123037-1</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Künstliche Intelligenz</subfield><subfield code="0">(DE-588)4033447-8</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Data Science</subfield><subfield code="0">(DE-588)1140936166</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Datenanalyse</subfield><subfield code="0">(DE-588)4123037-1</subfield><subfield code="D">s</subfield></datafield><datafield 
tag="689" ind1="0" ind2="2"><subfield code="a">Künstliche Intelligenz</subfield><subfield code="0">(DE-588)4033447-8</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Online-Ausgabe</subfield><subfield code="z">978-1-78883-043-0</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-030808030</subfield></datafield></record></collection> |
id | DE-604.BV045422162 |
illustrated | Illustrated |
indexdate | 2024-07-10T08:17:43Z |
institution | BVB |
isbn | 9781788839969 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-030808030 |
oclc_num | 1088316461 |
open_access_boolean | |
owner | DE-706 |
owner_facet | DE-706 |
physical | xvi, 467 Seiten Illustrationen, Diagramme |
publishDate | 2018 |
publishDateSearch | 2018 |
publishDateSort | 2018 |
publisher | Packt |
record_format | marc |
spelling | Taieb, David ca. 20./21. Jh. Verfasser (DE-588)1178741656 aut Thoughtful data science a programmer's toolset for data analysis and artificial intelligence with Python, Jupyter Notebook, and PixieDust David Taieb Birmingham ; Mumbai Packt July 2018 xvi, 467 Seiten Illustrationen, Diagramme txt rdacontent n rdamedia nc rdacarrier Cover -- Copyright -- Packt upsell -- Contributors -- Table of Contents -- Preface -- Chapter 1 - Perspectives on Data Science from a Developer -- What is data science -- Is data science here to stay? -- Why is data science on the rise? -- What does that have to do with developers? -- Putting these concepts into practice -- Deep diving into a concrete example -- Data pipeline blueprint -- What kind of skills are required to become a data scientist? -- IBM Watson DeepQA -- Back to our sentiment analysis of Twitter hashtags project -- Lessons learned from building our first enterprise-ready data pipeline -- Data science strategy -- Jupyter Notebooks at the center of our strategy -- Why are Notebooks so popular? -- Summary -- Chapter 2 - Data Science at Scale with Jupyter Notebooks and PixieDust -- Why choose Python? 
-- Introducing PixieDust -- SampleData - a simple API for loading data -- Wrangling data with pixiedust_rosie -- Display - a simple interactive API for data visualization -- Filtering -- Bridging the gap between developers and data scientists with PixieApps -- Architecture for operationalizing data science analytics -- Summary -- Chapter 3 - PixieApp under the Hood -- Anatomy of a PixieApp -- Routes -- Generating requests to routes -- A GitHub project tracking sample application -- Displaying the search results in a table -- Invoking the PixieDust display() API using pd_entity attribute -- Invoking arbitrary Python code with pd_script -- Making the application more responsive with pd_refresh -- Creating reusable widgets -- Summary -- Chapter 4 - Deploying PixieApps to the Web with the PixieGateway Server -- Overview of Kubernetes -- Installing and configuring the PixieGateway server -- PixieGateway server configuration -- PixieGateway architecture -- Publishing an application -- Encoding state in the PixieApp URL Sharing charts by publishing them as web pages -- PixieGateway admin console -- Python Console -- Displaying warmup and run code for a PixieApp -- Summary -- Chapter 5 - Best Practices and Advanced PixieDust Concepts -- Use @captureOutput decorator to integrate the output of third-party Python libraries -- Create a word cloud image with @captureOutput -- Increase modularity and code reuse -- Creating a widget with pd_widget -- PixieDust support of streaming data -- Adding streaming capabilities to your PixieApp -- Adding dashboard drill-downs with PixieApp events -- Extending PixieDust visualizations -- Debugging -- Debugging on the Jupyter Notebook using pdb -- Visual debugging with PixieDebugger -- Debugging PixieApp routes with PixieDebugger -- Troubleshooting issues using PixieDust logging -- Client-side debugging -- Run Node.js inside a Python Notebook -- Summary -- Chapter 6 - Image Recognition with TensorFlow -- What is machine learning? 
-- What is deep learning? -- Getting started with TensorFlow -- Simple classification with DNNClassifier -- Image recognition sample application -- Part 1 - Load the pretrained MobileNet model -- Part 2 - Create a PixieApp for our image recognition sample application -- Part 3 - Integrate the TensorBoard graph visualization -- Part 4 - Retrain the model with custom training data -- Summary -- Chapter 7 - Big Data Twitter Sentiment Analysis -- Getting started with Apache Spark -- Apache Spark architecture -- Configuring Notebooks to work with Spark -- Twitter sentiment analysis application -- Part 1 - Acquiring the data with Spark Structured Streaming -- Architecture diagram for the data pipeline -- Authentication with Twitter -- Creating the Twitter stream -- Creating a Spark Streaming DataFrame -- Creating and running a structured query -- Monitoring active streaming queries Creating a batch DataFrame from the Parquet files -- Part 2 - Enriching the data with sentiment and most relevant extracted entity -- Getting started with the IBM Watson Natural Language Understanding service -- Part 3 - Creating a real-time dashboard PixieApp -- Refactoring the analytics into their own methods -- Creating the PixieApp -- Part 4 - Adding scalability with Apache Kafka and IBM Streams Designer -- Streaming the raw tweets to Kafka -- Enriching the tweets data with the Streaming Analytics service -- Creating a Spark Streaming DataFrame with a Kafka input source -- Summary -- Chapter 8 - Financial Time Series Analysis and Forecasting -- Getting started with NumPy -- Creating a NumPy array -- Operations on ndarray -- Selections on NumPy arrays -- Broadcasting -- Statistical exploration of time series -- Hypothetical investment -- Autocorrelation function (ACF) and partial autocorrelation function (PACF) -- Putting it all together with the StockExplorer PixieApp -- BaseSubApp - base class for all the child PixieApps -- StockExploreSubApp - first child PixieApp -- MovingAverageSubApp 
- second child PixieApp -- AutoCorrelationSubApp - third child PixieApp -- Time series forecasting using the ARIMA model -- Build an ARIMA model for the MSFT stock time series -- StockExplorer PixieApp Part 2 - add time series forecasting using the ARIMA model -- Summary -- Chapter 9 - US Domestic Flight Data Analysis Using Graphs -- Introduction to graphs -- Graph representations -- Graph algorithms -- Graph and big data -- Getting started with the networkx graph library -- Creating a graph -- Visualizing a graph -- Part 1 - Loading the US domestic flight data into a graph -- Graph centrality -- Part 2 - Creating the USFlightsAnalysis PixieApp -- Part 3 - Adding data exploration to the USFlightsAnalysis PixieApp Data Science (DE-588)1140936166 gnd rswk-swf Datenanalyse (DE-588)4123037-1 gnd rswk-swf Künstliche Intelligenz (DE-588)4033447-8 gnd rswk-swf Data Science (DE-588)1140936166 s Datenanalyse (DE-588)4123037-1 s Künstliche Intelligenz (DE-588)4033447-8 s DE-604 Erscheint auch als Online-Ausgabe 978-1-78883-043-0 |
spellingShingle | Taieb, David ca. 20./21. Jh Thoughtful data science a programmer's toolset for data analysis and artificial intelligence with Python, Jupyter Notebook, and PixieDust Data Science (DE-588)1140936166 gnd Datenanalyse (DE-588)4123037-1 gnd Künstliche Intelligenz (DE-588)4033447-8 gnd |
subject_GND | (DE-588)1140936166 (DE-588)4123037-1 (DE-588)4033447-8 |
title | Thoughtful data science a programmer's toolset for data analysis and artificial intelligence with Python, Jupyter Notebook, and PixieDust |
title_auth | Thoughtful data science a programmer's toolset for data analysis and artificial intelligence with Python, Jupyter Notebook, and PixieDust |
title_exact_search | Thoughtful data science a programmer's toolset for data analysis and artificial intelligence with Python, Jupyter Notebook, and PixieDust |
title_full | Thoughtful data science a programmer's toolset for data analysis and artificial intelligence with Python, Jupyter Notebook, and PixieDust David Taieb |
title_fullStr | Thoughtful data science a programmer's toolset for data analysis and artificial intelligence with Python, Jupyter Notebook, and PixieDust David Taieb |
title_full_unstemmed | Thoughtful data science a programmer's toolset for data analysis and artificial intelligence with Python, Jupyter Notebook, and PixieDust David Taieb |
title_short | Thoughtful data science |
title_sort | thoughtful data science a programmer s toolset for data analysis and artificial intelligence with python jupyter notebook and pixiedust |
title_sub | a programmer's toolset for data analysis and artificial intelligence with Python, Jupyter Notebook, and PixieDust |
topic | Data Science (DE-588)1140936166 gnd Datenanalyse (DE-588)4123037-1 gnd Künstliche Intelligenz (DE-588)4033447-8 gnd |
topic_facet | Data Science Datenanalyse Künstliche Intelligenz |
work_keys_str_mv | AT taiebdavid thoughtfuldatascienceaprogrammerstoolsetfordataanalysisandartificialintelligencewithpythonjupyternotebookandpixiedust |