Apache Spark for data science cookbook : over insightful 90 recipes to get lightning-fast analytics with Apache Spark / Padma Priya Chitturi.
Saved in:

Author: | Chitturi, Padma Priya |
---|---|
Format: | Electronic eBook |
Language: | English |
Published: | Birmingham, UK : Packt Publishing, 2016. |
Subjects: | Spark (Electronic resource : Apache Software Foundation); Data mining; Information retrieval; Big data |
Online access: | Full text |
Summary: | Over 90 insightful recipes to get lightning-fast analytics with Apache Spark. About This Book: Use Apache Spark for data processing with these hands-on recipes; implement end-to-end, large-scale data analysis better than ever before; work with powerful libraries such as MLlib, SciPy, NumPy, and Pandas to gain insights from your data. Who This Book Is For: This book is for novice and intermediate-level data science professionals and data analysts who want to solve data science problems with a distributed computing framework. Basic experience with data science implementation tasks is expected. Data science professionals looking to skill up and gain an edge in the field will find this book helpful. What You Will Learn: Explore data mining, text mining, natural language processing, information retrieval, and machine learning; solve real-world analytical problems with large data sets; address data science challenges with analytical tools on a distributed system such as Spark (well suited to iterative algorithms), which offers in-memory processing and more flexibility for data analysis at scale; get hands-on experience with algorithms such as classification, regression, and recommendation on real datasets using the Spark MLlib package; learn about numerical and scientific computing using NumPy and SciPy on Spark; use the Predictive Model Markup Language (PMML) in Spark for statistical data mining models. In Detail: Spark has emerged as the most promising big data analytics engine for data science professionals. The true power and value of Apache Spark lie in its ability to execute data science tasks with speed and accuracy. Spark's selling point is that it combines ETL, batch analytics, real-time stream analysis, machine learning, graph processing, and visualization. It lets you tackle the complexities that come with raw, unstructured data sets with ease. This guide will get you comfortable and confident performing data science tasks with Spark. You will learn about implementations including distributed deep learning, numerical computing, and scalable machine learning. You will be shown effective solutions to problematic concepts in data science using Spark's data science libraries, such as MLlib, Pandas, NumPy, SciPy, and more. These simple and efficient recipes will show you how to implement algorithms and optimize your work. Style and Approach: This book contains a comprehensive range of recipes designed to help you learn the fundamentals and tackle the difficul... |
Description: | 1 online resource (1 volume) : illustrations |
ISBN: | 9781785288807; 1785288806 |
Internal format
MARC
LEADER | 00000cam a2200000 i 4500 | ||
---|---|---|---|
001 | ZDB-4-EBA-ocn969355608 | ||
003 | OCoLC | ||
005 | 20241004212047.0 | ||
006 | m o d | ||
007 | cr unu|||||||| | ||
008 | 170119s2016 enka o 000 0 eng d | ||
040 | |a UMI |b eng |e rda |e pn |c UMI |d N$T |d SCB |d IDEBK |d OCLCF |d TEFOD |d OCLCQ |d COO |d VT2 |d UOK |d CEF |d KSU |d DEBBG |d WYU |d UAB |d QGK |d OCLCO |d OCLCQ |d OCLCO |d OCLCQ |d DXU | ||
020 | |a 9781785288807 |q (electronic bk.) | ||
020 | |a 1785288806 |q (electronic bk.) | ||
020 | |z 9781785880100 | ||
035 | |a (OCoLC)969355608 | ||
037 | |a CL0500000820 |b Safari Books Online | ||
037 | |a C90D765B-3395-4C39-8E10-649B40A6387F |b OverDrive, Inc. |n http://www.overdrive.com | ||
050 | 4 | |a QA76.9.D343 | |
072 | 7 | |a COM |x 000000 |2 bisacsh | |
082 | 7 | |a 006.3/12 |2 23 | |
049 | |a MAIN | ||
100 | 1 | |a Chitturi, Padma Priya, |e author. | |
245 | 1 | 0 | |a Apache Spark for data science cookbook : |b over insightful 90 recipes to get lightning-fast analytics with Apache Spark / |c Padma Priya Chitturi. |
264 | 1 | |a Birmingham, UK : |b Packt Publishing, |c 2016. | |
300 | |a 1 online resource (1 volume) : |b illustrations | ||
336 | |a text |b txt |2 rdacontent | ||
337 | |a computer |b c |2 rdamedia | ||
338 | |a online resource |b cr |2 rdacarrier | ||
588 | |a Description based on online resource; title from cover (Safari, viewed January 17, 2017). | ||
520 | |a Over 90 insightful recipes to get lightning-fast analytics with Apache Spark. About This Book: Use Apache Spark for data processing with these hands-on recipes; implement end-to-end, large-scale data analysis better than ever before; work with powerful libraries such as MLlib, SciPy, NumPy, and Pandas to gain insights from your data. Who This Book Is For: This book is for novice and intermediate-level data science professionals and data analysts who want to solve data science problems with a distributed computing framework. Basic experience with data science implementation tasks is expected. Data science professionals looking to skill up and gain an edge in the field will find this book helpful. What You Will Learn: Explore data mining, text mining, natural language processing, information retrieval, and machine learning; solve real-world analytical problems with large data sets; address data science challenges with analytical tools on a distributed system such as Spark (well suited to iterative algorithms), which offers in-memory processing and more flexibility for data analysis at scale; get hands-on experience with algorithms such as classification, regression, and recommendation on real datasets using the Spark MLlib package; learn about numerical and scientific computing using NumPy and SciPy on Spark; use the Predictive Model Markup Language (PMML) in Spark for statistical data mining models. In Detail: Spark has emerged as the most promising big data analytics engine for data science professionals. The true power and value of Apache Spark lie in its ability to execute data science tasks with speed and accuracy. Spark's selling point is that it combines ETL, batch analytics, real-time stream analysis, machine learning, graph processing, and visualization. It lets you tackle the complexities that come with raw, unstructured data sets with ease. This guide will get you comfortable and confident performing data science tasks with Spark. You will learn about implementations including distributed deep learning, numerical computing, and scalable machine learning. You will be shown effective solutions to problematic concepts in data science using Spark's data science libraries, such as MLlib, Pandas, NumPy, SciPy, and more. These simple and efficient recipes will show you how to implement algorithms and optimize your work. Style and Approach: This book contains a comprehensive range of recipes designed to help you learn the fundamentals and tackle the difficul... | ||
505 | 0 | |a Cover -- Copyright -- Credits -- About the Author -- About the Reviewer -- www.PacktPub.com -- Customer Feedback -- Table of Contents -- Preface -- Chapter 1: Big Data Analytics with Spark -- Introduction -- Initializing SparkContext -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Working with Spark's Python and Scala shells -- How to do it... -- How it works... -- There's more... -- See also -- Building standalone applications -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Working with the Spark programming model -- How to do it... -- How it works... -- There's more... -- See also -- Working with pair RDDs -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Persisting RDDs -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Loading and saving data -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Creating broadcast variables and accumulators -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Submitting applications to a cluster -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Working with DataFrames -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Working with Spark Streaming -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 2: Tricky Statistics with Spark -- Introduction -- Working with Pandas -- Variable identification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Sampling data -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Summary and descriptive statistics -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Generating frequency tables. | |
505 | 8 | |a Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Installing Pandas on Linux -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Installing Pandas from source -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Using IPython with PySpark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Creating Pandas DataFrames over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Splitting, slicing, sorting, filtering, and grouping DataFrames over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing covariance and correlation using Pandas -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Concatenating and merging operations over DataFrames -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Complex operations over DataFrames -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Sparkling Pandas -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 3: Data Analysis with Spark -- Introduction -- Univariate analysis -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Bivariate analysis -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Missing value treatment -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Outlier detection -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Use case -- analyzing the MovieLens dataset -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Use case -- analyzing the Uber dataset -- Getting ready -- How to do it... -- How it works... -- There's more.... | |
505 | 8 | |a See also -- Chapter 4: Clustering, Classification, and Regression -- Introduction -- Supervised learning -- Unsupervised learning -- Applying regression analysis for sales data -- Variable identification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Data exploration -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Feature engineering -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Applying linear regression -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Applying logistic regression on bank marketing data -- Variable identification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Data exploration -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Feature engineering -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Applying logistic regression -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Real-time intrusion detection using streaming k-means -- Variable identification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Simulating real-time data -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Applying streaming k-means -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 5: Working with Spark MLlib -- Introduction -- Working with Spark ML pipelines -- Implementing Naive Bayes' classification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing decision trees -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Building a recommendation system -- Getting ready -- How to do it... -- How it works.... | |
505 | 8 | |a There's more... -- See also -- Implementing logistic regression using Spark ML pipelines -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 6: NLP with Spark -- Introduction -- Installing NLTK on Linux -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Installing Anaconda on Linux -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Anaconda for cluster management -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- POS tagging with PySpark on an Anaconda cluster -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- NER with IPython over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing OpenNLP -- chunker over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing OpenNLP -- sentence detector over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing Stanford NLP -- lemmatization over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing sentiment analysis using Stanford NLP over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 7: Working with Sparkling Water -- H2O -- Introduction -- Features -- Working with H2O on Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing k-means using H2O over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing spam detection with Sparkling Water -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Deep learning with airlines and weather data -- Getting ready -- How to do it... -- How it works.... | |
505 | 8 | |a There's more... -- See also -- Implementing a crime detection application -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Running SVM with H2O over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 8: Data Visualization with Spark -- Introduction -- Visualization using Zeppelin -- Getting ready -- How to do it... -- Installing Zeppelin -- Customizing Zeppelin's server and websocket port -- Visualizing data on HDFS -- parameterizing inputs -- Running custom functions -- Adding external dependencies to Zeppelin -- Pointing to an external Spark Cluster -- How to do it... -- How it works... -- There's more... -- See also -- Creating scatter plots with Bokeh-Scala -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Creating a time series MultiPlot with Bokeh-Scala -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Creating plots with the Lightning visualization server -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Visualizing machine learning models with Databricks notebook -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 9: Deep Learning on Spark -- Introduction -- Installing CaffeOnSpark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Working with CaffeOnSpark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Running a feed-forward neural network with DeepLearning4j over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Running an RBM with DeepLearning4j over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Running a CNN for learning MNIST with DeepLearning4j over Spark -- Getting ready -- How to do it.... | |
630 | 0 | 0 | |a Spark (Electronic resource : Apache Software Foundation) |0 http://id.loc.gov/authorities/names/no2015027445 |
630 | 0 | 7 | |a Spark (Electronic resource : Apache Software Foundation) |2 fast |
650 | 0 | |a Data mining. |0 http://id.loc.gov/authorities/subjects/sh97002073 | |
650 | 0 | |a Information retrieval. |0 http://id.loc.gov/authorities/subjects/sh85066148 | |
650 | 0 | |a Big data. |0 http://id.loc.gov/authorities/subjects/sh2012003227 | |
650 | 2 | |a Data Mining |0 https://id.nlm.nih.gov/mesh/D057225 | |
650 | 2 | |a Information Storage and Retrieval |0 https://id.nlm.nih.gov/mesh/D016247 | |
650 | 6 | |a Exploration de données (Informatique) | |
650 | 6 | |a Recherche de l'information. | |
650 | 6 | |a Données volumineuses. | |
650 | 7 | |a information retrieval. |2 aat | |
650 | 7 | |a COMPUTERS / General |2 bisacsh | |
650 | 7 | |a Big data |2 fast | |
650 | 7 | |a Data mining |2 fast | |
650 | 7 | |a Information retrieval |2 fast | |
856 | 4 | 0 | |l FWS01 |p ZDB-4-EBA |q FWS_PDA_EBA |u https://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&AN=1444391 |3 Volltext |
938 | |a ProQuest MyiLibrary Digital eBook Collection |b IDEB |n cis34515041 | ||
938 | |a EBSCOhost |b EBSC |n 1444391 | ||
994 | |a 92 |b GEBAY | ||
912 | |a ZDB-4-EBA | ||
049 | |a DE-863 |
-- See also -- Working with DataFrames -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Working with Spark Streaming -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 2: Tricky Statistics with Spark -- Introduction -- Working with Pandas -- Variable identification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Sampling data -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Summary and descriptive statistics -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Generating frequency tables.</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Installing Pandas on Linux -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Installing Pandas from source -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Using IPython with PySpark -- Getting ready -- How to do it... -- How it work... -- There's more... -- See also -- Creating Pandas DataFrames over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Splitting, slicing, sorting, filtering, and grouping DataFrames over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing co-variance and correlation using Pandas -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Concatenating and merging operations over DataFrames -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Complex operations over DataFrames -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Sparkling Pandas -- Getting ready -- How to do it... -- How it works... 
-- There's more... -- See also -- Chapter 3: Data Analysis with Spark -- Introduction -- Univariate analysis -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Bivariate analysis -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Missing value treatment -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Outlier detection -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Use case -- analyzing the MovieLens dataset -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Use case -- analyzing the Uber dataset -- Getting ready -- How to do it... -- How it works... -- There's more....</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">See also -- Chapter 4: Clustering, Classification, and Regression -- Introduction -- Supervised learning -- Unsupervised learning -- Applying regression analysis for sales data -- Variable identification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Data exploration -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Feature engineering -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Applying linear regression -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Applying logistic regression on bank marketing data -- Variable identification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Data exploration -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Feature engineering -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Applying logistic regression -- Getting ready -- How to do it... -- How it works... -- There's more... 
-- See also -- Real-time intrusion detection using streaming k-means -- Variable identification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Simulating real-time data -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Applying streaming k-means -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 5: Working with Spark MLlib -- Introduction -- Working with Spark ML pipelines -- Implementing Naive Bayes' classification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing decision trees -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Building a recommendation system -- Getting ready -- How to do it... -- How it works....</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">There's more... -- See also -- Implementing logistic regression using Spark ML pipelines -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 6: NLP with Spark -- Introduction -- Installing NLTK on Linux -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Installing Anaconda on Linux -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Anaconda for cluster management -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- POS tagging with PySpark on an Anaconda cluster -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- NER with IPython over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing openNLP -- chunker over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing openNLP -- sentence detector over Spark -- Getting ready -- How to do it... -- How it works... 
-- There's more... -- See also -- Implementing stanford NLP -- lemmatization over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing sentiment analysis using stanford NLP over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 7: Working with Sparkling Water -- H2O -- Introduction -- Features -- Working with H2O on Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing k-means using H2O over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing spam detection with Sparkling Water -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Deep learning with airlines and weather data -- Getting ready -- How to do it... -- How it works....</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">There's more... -- See also -- Implementing a crime detection application -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Running SVM with H2O over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 8: Data Visualization with Spark -- Introduction -- Visualization using Zeppelin -- Getting ready -- How to do it... -- Installing Zeppelin -- Customizing Zeppelin's server and websocket port -- Visualizing data on HDFS -- parameterizing inputs -- Running custom functions -- Adding external dependencies to Zeppelin -- Pointing to an external Spark Cluster -- How to do it... -- How it works... -- There's more... -- See also -- Creating scatter plots with Bokeh-Scala -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Creating a time series MultiPlot with Bokeh-Scala -- Getting ready -- How to do it... -- How it work... -- There's more... 
-- See also -- Creating plots with the lightning visualization server -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Visualize machine learning models with Databricks notebook -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 9: Deep Learning on Spark -- Introduction -- Installing CaffeOnSpark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Working with CaffeOnSpark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Running a feed-forward neural network with DeepLearning 4j over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Running an RBM with DeepLearning4j over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Running a CNN for learning MNIST with DeepLearning4j over Spark -- Getting ready -- How to do it....</subfield></datafield><datafield tag="630" ind1="0" ind2="0"><subfield code="a">Spark (Electronic resource : Apache Software Foundation)</subfield><subfield code="0">http://id.loc.gov/authorities/names/no2015027445</subfield></datafield><datafield tag="630" ind1="0" ind2="7"><subfield code="a">Spark (Electronic resource : Apache Software Foundation)</subfield><subfield code="2">fast</subfield></datafield><datafield tag="650" ind1=" " ind2="0"><subfield code="a">Data mining.</subfield><subfield code="0">http://id.loc.gov/authorities/subjects/sh97002073</subfield></datafield><datafield tag="650" ind1=" " ind2="0"><subfield code="a">Information retrieval.</subfield><subfield code="0">http://id.loc.gov/authorities/subjects/sh85066148</subfield></datafield><datafield tag="650" ind1=" " ind2="0"><subfield code="a">Big data.</subfield><subfield code="0">http://id.loc.gov/authorities/subjects/sh2012003227</subfield></datafield><datafield tag="650" ind1=" " ind2="2"><subfield code="a">Data 
Mining</subfield><subfield code="0">https://id.nlm.nih.gov/mesh/D057225</subfield></datafield><datafield tag="650" ind1=" " ind2="2"><subfield code="a">Information Storage and Retrieval</subfield><subfield code="0">https://id.nlm.nih.gov/mesh/D016247</subfield></datafield><datafield tag="650" ind1=" " ind2="6"><subfield code="a">Exploration de données (Informatique)</subfield></datafield><datafield tag="650" ind1=" " ind2="6"><subfield code="a">Recherche de l'information.</subfield></datafield><datafield tag="650" ind1=" " ind2="6"><subfield code="a">Données volumineuses.</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">information retrieval.</subfield><subfield code="2">aat</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">COMPUTERS / General</subfield><subfield code="2">bisacsh</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Big data</subfield><subfield code="2">fast</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Data mining</subfield><subfield code="2">fast</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Information retrieval</subfield><subfield code="2">fast</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="l">FWS01</subfield><subfield code="p">ZDB-4-EBA</subfield><subfield code="q">FWS_PDA_EBA</subfield><subfield code="u">https://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&AN=1444391</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="938" ind1=" " ind2=" "><subfield code="a">ProQuest MyiLibrary Digital eBook Collection</subfield><subfield code="b">IDEB</subfield><subfield code="n">cis34515041</subfield></datafield><datafield tag="938" ind1=" " ind2=" "><subfield code="a">EBSCOhost</subfield><subfield code="b">EBSC</subfield><subfield code="n">1444391</subfield></datafield><datafield tag="994" ind1=" " ind2=" "><subfield 
code="a">92</subfield><subfield code="b">GEBAY</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ZDB-4-EBA</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-863</subfield></datafield></record></collection> |
id | ZDB-4-EBA-ocn969355608 |
illustrated | Illustrated |
indexdate | 2024-11-27T13:27:37Z |
institution | BVB |
isbn | 9781785288807 1785288806 |
language | English |
oclc_num | 969355608 |
open_access_boolean | |
owner | MAIN DE-863 DE-BY-FWS |
owner_facet | MAIN DE-863 DE-BY-FWS |
physical | 1 online resource (1 volume) : illustrations |
psigel | ZDB-4-EBA |
publishDate | 2016 |
publishDateSearch | 2016 |
publishDateSort | 2016 |
publisher | Packt Publishing, |
record_format | marc |
spelling | Chitturi, Padma Priya, author. Apache Spark for data science cookbook : over insightful 90 recipes to get lightning-fast analytics with Apache Spark / Padma Priya Chitturi. Birmingham, UK : Packt Publishing, 2016. 1 online resource (1 volume) : illustrations text txt rdacontent computer c rdamedia online resource cr rdacarrier Description based on online resource; title from cover (Safari, viewed January 17, 2017). Over insightful 90 recipes to get lightning-fast analytics with Apache Spark About This Book Use Apache Spark for data processing with these hands-on recipes Implement end-to-end, large-scale data analysis better than ever before Work with powerful libraries such as MLLib, SciPy, NumPy, and Pandas to gain insights from your data Who This Book Is For This book is for novice and intermediate level data science professionals and data analysts who want to solve data science problems with a distributed computing framework. Basic experience with data science implementation tasks is expected. Data science professionals looking to skill up and gain an edge in the field will find this book helpful. What You Will Learn Explore the topics of data mining, text mining, Natural Language Processing, information retrieval, and machine learning. Solve real-world analytical problems with large data sets. Address data science challenges with analytical tools on a distributed system like Spark (apt for iterative algorithms), which offers in-memory processing and more flexibility for data analysis at scale. Get hands-on experience with algorithms like Classification, regression, and recommendation on real datasets using Spark MLLib package. Learn about numerical and scientific computing using NumPy and SciPy on Spark. Use Predictive Model Markup Language (PMML) in Spark for statistical data mining models. In Detail Spark has emerged as the most promising big data analytics engine for data science professionals. 
The true power and value of Apache Spark lies in its ability to execute data science tasks with speed and accuracy. Spark's selling point is that it combines ETL, batch analytics, real-time stream analysis, machine learning, graph processing, and visualizations. It lets you tackle the complexities that come with raw unstructured data sets with ease. This guide will get you comfortable and confident performing data science tasks with Spark. You will learn about implementations including distributed deep learning, numerical computing, and scalable machine learning. You will be shown effective solutions to problematic concepts in data science using Spark's data science libraries such as MLLib, Pandas, NumPy, SciPy, and more. These simple and efficient recipes will show you how to implement algorithms and optimize your work. Style and approach This book contains a comprehensive range of recipes designed to help you learn the fundamentals and tackle the difficul... Cover -- Copyright -- Credits -- About the Author -- About the Reviewer -- www.PacktPub.com -- Customer Feedback -- Table of Contents -- Preface -- Chapter 1: Big Data Analytics with Spark -- Introduction -- Initializing SparkContext -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Working with Spark's Python and Scala shells -- How to do it... -- How it works... -- There's more... -- See also -- Building standalone applications -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Working with the Spark programming model -- How to do it... -- How it works... -- There's more... -- See also -- Working with pair RDDs -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Persisting RDDs -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Loading and saving data -- Getting ready -- How to do it... -- How it works... -- There's more... 
-- See also -- Creating broadcast variables and accumulators -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Submitting applications to a cluster -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Working with DataFrames -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Working with Spark Streaming -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 2: Tricky Statistics with Spark -- Introduction -- Working with Pandas -- Variable identification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Sampling data -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Summary and descriptive statistics -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Generating frequency tables. Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Installing Pandas on Linux -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Installing Pandas from source -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Using IPython with PySpark -- Getting ready -- How to do it... -- How it work... -- There's more... -- See also -- Creating Pandas DataFrames over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Splitting, slicing, sorting, filtering, and grouping DataFrames over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing co-variance and correlation using Pandas -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Concatenating and merging operations over DataFrames -- Getting ready -- How to do it... -- How it works... -- There's more... 
-- See also -- Complex operations over DataFrames -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Sparkling Pandas -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 3: Data Analysis with Spark -- Introduction -- Univariate analysis -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Bivariate analysis -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Missing value treatment -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Outlier detection -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Use case -- analyzing the MovieLens dataset -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Use case -- analyzing the Uber dataset -- Getting ready -- How to do it... -- How it works... -- There's more.... See also -- Chapter 4: Clustering, Classification, and Regression -- Introduction -- Supervised learning -- Unsupervised learning -- Applying regression analysis for sales data -- Variable identification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Data exploration -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Feature engineering -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Applying linear regression -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Applying logistic regression on bank marketing data -- Variable identification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Data exploration -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Feature engineering -- Getting ready -- How to do it... -- How it works... -- There's more... 
-- See also -- Applying logistic regression -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Real-time intrusion detection using streaming k-means -- Variable identification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Simulating real-time data -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Applying streaming k-means -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 5: Working with Spark MLlib -- Introduction -- Working with Spark ML pipelines -- Implementing Naive Bayes' classification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing decision trees -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Building a recommendation system -- Getting ready -- How to do it... -- How it works.... There's more... -- See also -- Implementing logistic regression using Spark ML pipelines -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 6: NLP with Spark -- Introduction -- Installing NLTK on Linux -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Installing Anaconda on Linux -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Anaconda for cluster management -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- POS tagging with PySpark on an Anaconda cluster -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- NER with IPython over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing openNLP -- chunker over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... 
-- See also -- Implementing openNLP -- sentence detector over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing stanford NLP -- lemmatization over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing sentiment analysis using stanford NLP over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 7: Working with Sparkling Water -- H2O -- Introduction -- Features -- Working with H2O on Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing k-means using H2O over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing spam detection with Sparkling Water -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Deep learning with airlines and weather data -- Getting ready -- How to do it... -- How it works.... There's more... -- See also -- Implementing a crime detection application -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Running SVM with H2O over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 8: Data Visualization with Spark -- Introduction -- Visualization using Zeppelin -- Getting ready -- How to do it... -- Installing Zeppelin -- Customizing Zeppelin's server and websocket port -- Visualizing data on HDFS -- parameterizing inputs -- Running custom functions -- Adding external dependencies to Zeppelin -- Pointing to an external Spark Cluster -- How to do it... -- How it works... -- There's more... -- See also -- Creating scatter plots with Bokeh-Scala -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Creating a time series MultiPlot with Bokeh-Scala -- Getting ready -- How to do it... -- How it work... -- There's more... 
-- See also -- Creating plots with the lightning visualization server -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Visualize machine learning models with Databricks notebook -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 9: Deep Learning on Spark -- Introduction -- Installing CaffeOnSpark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Working with CaffeOnSpark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Running a feed-forward neural network with DeepLearning 4j over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Running an RBM with DeepLearning4j over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Running a CNN for learning MNIST with DeepLearning4j over Spark -- Getting ready -- How to do it.... Spark (Electronic resource : Apache Software Foundation) http://id.loc.gov/authorities/names/no2015027445 Spark (Electronic resource : Apache Software Foundation) fast Data mining. http://id.loc.gov/authorities/subjects/sh97002073 Information retrieval. http://id.loc.gov/authorities/subjects/sh85066148 Big data. http://id.loc.gov/authorities/subjects/sh2012003227 Data Mining https://id.nlm.nih.gov/mesh/D057225 Information Storage and Retrieval https://id.nlm.nih.gov/mesh/D016247 Exploration de données (Informatique) Recherche de l'information. Données volumineuses. information retrieval. aat COMPUTERS / General bisacsh Big data fast Data mining fast Information retrieval fast FWS01 ZDB-4-EBA FWS_PDA_EBA https://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&AN=1444391 Volltext |
spellingShingle | Chitturi, Padma Priya Apache Spark for data science cookbook : over insightful 90 recipes to get lightning-fast analytics with Apache Spark / Cover -- Copyright -- Credits -- About the Author -- About the Reviewer -- www.PacktPub.com -- Customer Feedback -- Table of Contents -- Preface -- Chapter 1: Big Data Analytics with Spark -- Introduction -- Initializing SparkContext -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Working with Spark's Python and Scala shells -- How to do it... -- How it works... -- There's more... -- See also -- Building standalone applications -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Working with the Spark programming model -- How to do it... -- How it works... -- There's more... -- See also -- Working with pair RDDs -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Persisting RDDs -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Loading and saving data -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Creating broadcast variables and accumulators -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Submitting applications to a cluster -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Working with DataFrames -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Working with Spark Streaming -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 2: Tricky Statistics with Spark -- Introduction -- Working with Pandas -- Variable identification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Sampling data -- Getting ready -- How to do it... -- How it works... -- There's more... 
-- See also -- Summary and descriptive statistics -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Generating frequency tables -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Installing Pandas on Linux -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Installing Pandas from source -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Using IPython with PySpark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Creating Pandas DataFrames over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Splitting, slicing, sorting, filtering, and grouping DataFrames over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing co-variance and correlation using Pandas -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Concatenating and merging operations over DataFrames -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Complex operations over DataFrames -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Sparkling Pandas -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 3: Data Analysis with Spark -- Introduction -- Univariate analysis -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Bivariate analysis -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Missing value treatment -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Outlier detection -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Use case -- analyzing the MovieLens dataset -- Getting ready -- How to do it...
-- How it works... -- There's more... -- See also -- Use case -- analyzing the Uber dataset -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 4: Clustering, Classification, and Regression -- Introduction -- Supervised learning -- Unsupervised learning -- Applying regression analysis for sales data -- Variable identification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Data exploration -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Feature engineering -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Applying linear regression -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Applying logistic regression on bank marketing data -- Variable identification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Data exploration -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Feature engineering -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Applying logistic regression -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Real-time intrusion detection using streaming k-means -- Variable identification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Simulating real-time data -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Applying streaming k-means -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 5: Working with Spark MLlib -- Introduction -- Working with Spark ML pipelines -- Implementing Naive Bayes' classification -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing decision trees -- Getting ready -- How to do it... -- How it works...
-- There's more... -- See also -- Building a recommendation system -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing logistic regression using Spark ML pipelines -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 6: NLP with Spark -- Introduction -- Installing NLTK on Linux -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Installing Anaconda on Linux -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Anaconda for cluster management -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- POS tagging with PySpark on an Anaconda cluster -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- NER with IPython over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing OpenNLP -- chunker over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing OpenNLP -- sentence detector over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing Stanford NLP -- lemmatization over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing sentiment analysis using Stanford NLP over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 7: Working with Sparkling Water -- H2O -- Introduction -- Features -- Working with H2O on Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing k-means using H2O over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing spam detection with Sparkling Water -- Getting ready -- How to do it... -- How it works... -- There's more...
-- See also -- Deep learning with airlines and weather data -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Implementing a crime detection application -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Running SVM with H2O over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 8: Data Visualization with Spark -- Introduction -- Visualization using Zeppelin -- Getting ready -- How to do it... -- Installing Zeppelin -- Customizing Zeppelin's server and websocket port -- Visualizing data on HDFS -- parameterizing inputs -- Running custom functions -- Adding external dependencies to Zeppelin -- Pointing to an external Spark Cluster -- How to do it... -- How it works... -- There's more... -- See also -- Creating scatter plots with Bokeh-Scala -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Creating a time series MultiPlot with Bokeh-Scala -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Creating plots with the lightning visualization server -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Visualize machine learning models with Databricks notebook -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Chapter 9: Deep Learning on Spark -- Introduction -- Installing CaffeOnSpark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Working with CaffeOnSpark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Running a feed-forward neural network with DeepLearning4j over Spark -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Running an RBM with DeepLearning4j over Spark -- Getting ready -- How to do it... -- How it works... -- There's more...
-- See also -- Running a CNN for learning MNIST with DeepLearning4j over Spark -- Getting ready -- How to do it... Spark (Electronic resource : Apache Software Foundation) http://id.loc.gov/authorities/names/no2015027445 Spark (Electronic resource : Apache Software Foundation) fast Data mining. http://id.loc.gov/authorities/subjects/sh97002073 Information retrieval. http://id.loc.gov/authorities/subjects/sh85066148 Big data. http://id.loc.gov/authorities/subjects/sh2012003227 Data Mining https://id.nlm.nih.gov/mesh/D057225 Information Storage and Retrieval https://id.nlm.nih.gov/mesh/D016247 Exploration de données (Informatique) Recherche de l'information. Données volumineuses. information retrieval. aat COMPUTERS / General bisacsh Big data fast Data mining fast Information retrieval fast |
subject_GND | http://id.loc.gov/authorities/names/no2015027445 http://id.loc.gov/authorities/subjects/sh97002073 http://id.loc.gov/authorities/subjects/sh85066148 http://id.loc.gov/authorities/subjects/sh2012003227 https://id.nlm.nih.gov/mesh/D057225 https://id.nlm.nih.gov/mesh/D016247 |
title | Apache Spark for data science cookbook : over insightful 90 recipes to get lightning-fast analytics with Apache Spark /
title_auth | Apache Spark for data science cookbook : over insightful 90 recipes to get lightning-fast analytics with Apache Spark /
title_exact_search | Apache Spark for data science cookbook : over insightful 90 recipes to get lightning-fast analytics with Apache Spark /
title_full | Apache Spark for data science cookbook : over insightful 90 recipes to get lightning-fast analytics with Apache Spark / Padma Priya Chitturi.
title_fullStr | Apache Spark for data science cookbook : over insightful 90 recipes to get lightning-fast analytics with Apache Spark / Padma Priya Chitturi.
title_full_unstemmed | Apache Spark for data science cookbook : over insightful 90 recipes to get lightning-fast analytics with Apache Spark / Padma Priya Chitturi.
title_short | Apache Spark for data science cookbook : |
title_sort | apache spark for data science cookbook over insightful 90 recipes to get lightning fast analytics with apache spark
title_sub | over insightful 90 recipes to get lightning-fast analytics with Apache Spark /
topic | Spark (Electronic resource : Apache Software Foundation) http://id.loc.gov/authorities/names/no2015027445 Spark (Electronic resource : Apache Software Foundation) fast Data mining. http://id.loc.gov/authorities/subjects/sh97002073 Information retrieval. http://id.loc.gov/authorities/subjects/sh85066148 Big data. http://id.loc.gov/authorities/subjects/sh2012003227 Data Mining https://id.nlm.nih.gov/mesh/D057225 Information Storage and Retrieval https://id.nlm.nih.gov/mesh/D016247 Exploration de données (Informatique) Recherche de l'information. Données volumineuses. information retrieval. aat COMPUTERS / General bisacsh Big data fast Data mining fast Information retrieval fast |
topic_facet | Spark (Electronic resource : Apache Software Foundation) Data mining. Information retrieval. Big data. Data Mining Information Storage and Retrieval Exploration de données (Informatique) Recherche de l'information. Données volumineuses. information retrieval. COMPUTERS / General Big data Data mining Information retrieval |
url | https://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&AN=1444391 |
work_keys_str_mv | AT chitturipadmapriya apachesparkfordatasciencecookbookoverinsightful90recipestogetlightningfastanalyticswithapachespark |