Applied data science using PySpark: learn the end-to-end predictive model-building cycle
Main author: | Kakarla, Ramcharan |
---|---|
Format: | Book |
Language: | English |
Published: | New York : Apress, [2021] |
Subjects: | |
Summary: | Discover the capabilities of PySpark and its application in the realm of data science. This comprehensive guide with hand-picked examples of daily use cases will walk you through the end-to-end predictive model-building cycle with the latest techniques and tricks of the trade. Applied Data Science Using PySpark is divided into six sections which walk you through the book. In section 1, you start with the basics of PySpark focusing on data manipulation. We make you comfortable with the language and then build upon it to introduce you to the mathematical functions available off the shelf. In section 2, you will dive into the art of variable selection where we demonstrate various selection techniques available in PySpark. In section 3, we take you on a journey through machine learning algorithms, implementations, and fine-tuning techniques. We will also talk about different validation metrics and how to use them for picking the best models. Sections 4 and 5 go through machine learning pipelines and various methods available to operationalize the model and serve it through Docker/an API. In the final section, you will cover reusable objects for easy experimentation and learn some tricks that can help you optimize your programs and machine learning pipelines. By the end of this book, you will have seen the flexibility and advantages of PySpark in data science applications. This book is recommended to those who want to unleash the power of parallel computing by simultaneously working with big datasets. You will: Build an end-to-end predictive model ; Implement multiple variable selection techniques ; Operationalize models ; Master multiple algorithms and implementations |
Note: | Includes index |
Physical description: | xxvi, 410 pages, illustrations, diagrams, 26 cm |
ISBN: | 1484264991 9781484264997 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV047686647 | ||
003 | DE-604 | ||
005 | 20220204 | ||
007 | t | ||
008 | 220118s2021 a||| |||| 00||| eng d | ||
020 | |a 1484264991 |9 1-4842-6499-1 | ||
020 | |a 9781484264997 |9 978-1-4842-6499-7 | ||
035 | |a (OCoLC)1296327191 | ||
035 | |a (DE-599)BVBBV047686647 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-703 | ||
084 | |a ST 250 |0 (DE-625)143626: |2 rvk | ||
100 | 1 | |a Kakarla, Ramcharan |e Verfasser |0 (DE-588)1227388209 |4 aut | |
245 | 1 | 0 | |a Applied data science using PySpark |b learn the end-to-end predictive model-building cycle |c Ramcharan Kakarla, Sundar Krishnan, Sridhar Alla |
264 | 1 | |a New York |b Apress |c [2021] | |
300 | |a xxvi, 410 Seiten |b Illustrationen, Diagramme |c 26 cm | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
500 | |a Includes index | ||
505 | 8 | |a Chapter 1: Setting up the Pyspark Environment -- Chapter 2: PySpark basics -- Chapter 3: Utility functions and visualizations -- Chapter 4: Variable selection -- Chapter 5: Supervised learning algorithms -- Chapter 6: Model evaluation -- Chapter 7: Unsupervised learning and recommendation algorithms -- Chapter 8: Machine learning flow and automated pipelines -- Chapter 9: Deploying machine learning models -- Appendix: additional resources | |
520 | 3 | |a Discover the capabilities of PySpark and its application in the realm of data science. This comprehensive guide with hand-picked examples of daily use cases will walk you through the end-to-end predictive model-building cycle with the latest techniques and tricks of the trade. Applied Data Science Using PySpark is divided into six sections which walk you through the book. In section 1, you start with the basics of PySpark focusing on data manipulation. We make you comfortable with the language and then build upon it to introduce you to the mathematical functions available off the shelf. In section 2, you will dive into the art of variable selection where we demonstrate various selection techniques available in PySpark. In section 3, we take you on a journey through machine learning algorithms, implementations, and fine-tuning techniques. We will also talk about different validation metrics and how to use them for picking the best models. Sections 4 and 5 go through machine learning pipelines and various methods available to operationalize the model and serve it through Docker/an API. In the final section, you will cover reusable objects for easy experimentation and learn some tricks that can help you optimize your programs and machine learning pipelines. By the end of this book, you will have seen the flexibility and advantages of PySpark in data science applications. This book is recommended to those who want to unleash the power of parallel computing by simultaneously working with big datasets. You will: Build an end-to-end predictive model ; Implement multiple variable selection techniques ; Operationalize models ; Master multiple algorithms and implementations | |
650 | 0 | 7 | |a Big Data |0 (DE-588)4802620-7 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Datenanalyse |0 (DE-588)4123037-1 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |2 gnd |9 rswk-swf |
653 | 0 | |a Big data | |
653 | 0 | |a Machine learning | |
653 | 0 | |a Computer software | |
653 | 0 | |a Python (Computer program language) | |
653 | 0 | |a Parallel processing (Electronic computers) | |
689 | 0 | 0 | |a Datenanalyse |0 (DE-588)4123037-1 |D s |
689 | 0 | 1 | |a Big Data |0 (DE-588)4802620-7 |D s |
689 | 0 | 2 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Krishnan, Sundar |e Sonstige |0 (DE-588)1227389388 |4 oth | |
700 | 1 | |a Alla, Sridhar |e Sonstige |0 (DE-588)1199320188 |4 oth | |
776 | 0 | 8 | |i Erscheint auch als |n Online-Ausgabe |z 978-1-4842-6500-0 |
999 | |a oai:aleph.bib-bvb.de:BVB01-033070671 |
Record in the search index
_version_ | 1804183172447993856 |
---|---|
adam_txt | |
any_adam_object | |
any_adam_object_boolean | |
author | Kakarla, Ramcharan |
author_GND | (DE-588)1227388209 (DE-588)1227389388 (DE-588)1199320188 |
author_facet | Kakarla, Ramcharan |
author_role | aut |
author_sort | Kakarla, Ramcharan |
author_variant | r k rk |
building | Verbundindex |
bvnumber | BV047686647 |
classification_rvk | ST 250 |
contents | Chapter 1: Setting up the Pyspark Environment -- Chapter 2: PySpark basics -- Chapter 3: Utility functions and visualizations -- Chapter 4: Variable selection -- Chapter 5: Supervised learning algorithms -- Chapter 6: Model evaluation -- Chapter 7: Unsupervised learning and recommendation algorithms -- Chapter 8: Machine learning flow and automated pipelines -- Chapter 9: Deploying machine learning models -- Appendix: additional resources |
ctrlnum | (OCoLC)1296327191 (DE-599)BVBBV047686647 |
discipline | Informatik |
discipline_str_mv | Informatik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>04105nam a2200553 c 4500</leader><controlfield tag="001">BV047686647</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20220204 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">220118s2021 a||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">1484264991</subfield><subfield code="9">1-4842-6499-1</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781484264997</subfield><subfield code="9">978-1-4842-6499-7</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1296327191</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV047686647</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-703</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 250</subfield><subfield code="0">(DE-625)143626:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Kakarla, Ramcharan</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1227388209</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Applied data science using PySpark</subfield><subfield code="b">learn the end-to-end predictive model-building cycle</subfield><subfield code="c">Ramcharan Kakarla, Sundar Krishnan, Sridhar Alla</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">New York</subfield><subfield code="b">Apress</subfield><subfield 
code="c">[2021]</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">xxvi, 410 Seiten</subfield><subfield code="b">Illustrationen, Diagramme</subfield><subfield code="c">26 cm</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Includes index</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">Chapter 1: Setting up the Pyspark Environment -- Chapter 2: PySpark basics -- Chapter 3: Utility functions and visualizations -- Chapter 4: Variable selection -- Chapter 5: Supervised learning algorithms -- Chapter 6: Model evaluation -- Chapter 7: Unsupervised learning and recommendation algorithms -- Chapter 8: Machine learning flow and automated pipelines -- Chapter 9: Deploying machine learning models -- Appendix: additional resources</subfield></datafield><datafield tag="520" ind1="3" ind2=" "><subfield code="a">Discover the capabilities of PySpark and its application in the realm of data science. This comprehensive guide with hand-picked examples of daily use cases will walk you through the end-to-end predictive model-building cycle with the latest techniques and tricks of the trade. Applied Data Science Using PySpark is divided unto six sections which walk you through the book. In section 1, you start with the basics of PySpark focusing on data manipulation. We make you comfortable with the language and then build upon it to introduce you to the mathematical functions available off the shelf. In section 2, you will dive into the art of variable selection where we demonstrate various selection techniques available in PySpark. 
In section 3, we take you on a journey through machine learning algorithms, implementations, and fine-tuning techniques. We will also talk about different validation metrics and how to use them for picking the best models. Sections 4 and 5 go through machine learning pipelines and various methods available to operationalize the model and serve it through Docker/an API. In the final section, you will cover reusable objects for easy experimentation and learn some tricks that can help you optimize your programs and machine learning pipelines. By the end of this book, you will have seen the flexibility and advantages of PySpark in data science applications. This book is recommended to those who want to unleash the power of parallel computing by simultaneously working with big datasets. You will: Build an end-to-end predictive model ; Implement multiple variable selection techniques ; Operationalize models ; Master multiple algorithms and implementations</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Big Data</subfield><subfield code="0">(DE-588)4802620-7</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Datenanalyse</subfield><subfield code="0">(DE-588)4123037-1</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Maschinelles Lernen</subfield><subfield code="0">(DE-588)4193754-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Big data</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Machine learning</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Computer software</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Python (Computer program 
language)</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Parallel processing (Electronic computers)</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Python (Computer program language)</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Parallel processing (Electronic computers)</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Big data</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Computer software</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Machine learning</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Datenanalyse</subfield><subfield code="0">(DE-588)4123037-1</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Big Data</subfield><subfield code="0">(DE-588)4802620-7</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Maschinelles Lernen</subfield><subfield code="0">(DE-588)4193754-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Krishnan, Sundar</subfield><subfield code="e">Sonstige</subfield><subfield code="0">(DE-588)1227389388</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Alla, Sridhar</subfield><subfield code="e">Sonstige</subfield><subfield code="0">(DE-588)1199320188</subfield><subfield code="4">oth</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Online-Ausgabe</subfield><subfield code="z">978-1-4842-6500-0</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield 
code="a">oai:aleph.bib-bvb.de:BVB01-033070671</subfield></datafield></record></collection> |
id | DE-604.BV047686647 |
illustrated | Illustrated |
index_date | 2024-07-03T18:56:47Z |
indexdate | 2024-07-10T09:19:13Z |
institution | BVB |
isbn | 1484264991 9781484264997 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-033070671 |
oclc_num | 1296327191 |
open_access_boolean | |
owner | DE-703 |
owner_facet | DE-703 |
physical | xxvi, 410 Seiten Illustrationen, Diagramme 26 cm |
publishDate | 2021 |
publishDateSearch | 2021 |
publishDateSort | 2021 |
publisher | Apress |
record_format | marc |
spelling | Kakarla, Ramcharan Verfasser (DE-588)1227388209 aut Applied data science using PySpark learn the end-to-end predictive model-building cycle Ramcharan Kakarla, Sundar Krishnan, Sridhar Alla New York Apress [2021] xxvi, 410 Seiten Illustrationen, Diagramme 26 cm txt rdacontent n rdamedia nc rdacarrier Includes index Chapter 1: Setting up the Pyspark Environment -- Chapter 2: PySpark basics -- Chapter 3: Utility functions and visualizations -- Chapter 4: Variable selection -- Chapter 5: Supervised learning algorithms -- Chapter 6: Model evaluation -- Chapter 7: Unsupervised learning and recommendation algorithms -- Chapter 8: Machine learning flow and automated pipelines -- Chapter 9: Deploying machine learning models -- Appendix: additional resources Discover the capabilities of PySpark and its application in the realm of data science. This comprehensive guide with hand-picked examples of daily use cases will walk you through the end-to-end predictive model-building cycle with the latest techniques and tricks of the trade. Applied Data Science Using PySpark is divided unto six sections which walk you through the book. In section 1, you start with the basics of PySpark focusing on data manipulation. We make you comfortable with the language and then build upon it to introduce you to the mathematical functions available off the shelf. In section 2, you will dive into the art of variable selection where we demonstrate various selection techniques available in PySpark. In section 3, we take you on a journey through machine learning algorithms, implementations, and fine-tuning techniques. We will also talk about different validation metrics and how to use them for picking the best models. Sections 4 and 5 go through machine learning pipelines and various methods available to operationalize the model and serve it through Docker/an API. 
In the final section, you will cover reusable objects for easy experimentation and learn some tricks that can help you optimize your programs and machine learning pipelines. By the end of this book, you will have seen the flexibility and advantages of PySpark in data science applications. This book is recommended to those who want to unleash the power of parallel computing by simultaneously working with big datasets. You will: Build an end-to-end predictive model ; Implement multiple variable selection techniques ; Operationalize models ; Master multiple algorithms and implementations Big Data (DE-588)4802620-7 gnd rswk-swf Datenanalyse (DE-588)4123037-1 gnd rswk-swf Maschinelles Lernen (DE-588)4193754-5 gnd rswk-swf Big data Machine learning Computer software Python (Computer program language) Parallel processing (Electronic computers) Datenanalyse (DE-588)4123037-1 s Big Data (DE-588)4802620-7 s Maschinelles Lernen (DE-588)4193754-5 s DE-604 Krishnan, Sundar Sonstige (DE-588)1227389388 oth Alla, Sridhar Sonstige (DE-588)1199320188 oth Erscheint auch als Online-Ausgabe 978-1-4842-6500-0 |
spellingShingle | Kakarla, Ramcharan Applied data science using PySpark learn the end-to-end predictive model-building cycle Chapter 1: Setting up the Pyspark Environment -- Chapter 2: PySpark basics -- Chapter 3: Utility functions and visualizations -- Chapter 4: Variable selection -- Chapter 5: Supervised learning algorithms -- Chapter 6: Model evaluation -- Chapter 7: Unsupervised learning and recommendation algorithms -- Chapter 8: Machine learning flow and automated pipelines -- Chapter 9: Deploying machine learning models -- Appendix: additional resources Big Data (DE-588)4802620-7 gnd Datenanalyse (DE-588)4123037-1 gnd Maschinelles Lernen (DE-588)4193754-5 gnd |
subject_GND | (DE-588)4802620-7 (DE-588)4123037-1 (DE-588)4193754-5 |
title | Applied data science using PySpark learn the end-to-end predictive model-building cycle |
title_auth | Applied data science using PySpark learn the end-to-end predictive model-building cycle |
title_exact_search | Applied data science using PySpark learn the end-to-end predictive model-building cycle |
title_exact_search_txtP | Applied data science using PySpark learn the end-to-end predictive model-building cycle |
title_full | Applied data science using PySpark learn the end-to-end predictive model-building cycle Ramcharan Kakarla, Sundar Krishnan, Sridhar Alla |
title_fullStr | Applied data science using PySpark learn the end-to-end predictive model-building cycle Ramcharan Kakarla, Sundar Krishnan, Sridhar Alla |
title_full_unstemmed | Applied data science using PySpark learn the end-to-end predictive model-building cycle Ramcharan Kakarla, Sundar Krishnan, Sridhar Alla |
title_short | Applied data science using PySpark |
title_sort | applied data science using pyspark learn the end to end predictive model building cycle |
title_sub | learn the end-to-end predictive model-building cycle |
topic | Big Data (DE-588)4802620-7 gnd Datenanalyse (DE-588)4123037-1 gnd Maschinelles Lernen (DE-588)4193754-5 gnd |
topic_facet | Big Data Datenanalyse Maschinelles Lernen |
work_keys_str_mv | AT kakarlaramcharan applieddatascienceusingpysparklearntheendtoendpredictivemodelbuildingcycle AT krishnansundar applieddatascienceusingpysparklearntheendtoendpredictivemodelbuildingcycle AT allasridhar applieddatascienceusingpysparklearntheendtoendpredictivemodelbuildingcycle |