Scalable big data architecture: a practitioner's guide to choosing relevant big data architecture
Main author: | Azarmi, Bahaaldine |
---|---|
Format: | Book |
Language: | English |
Published: | New York: Apress, [2016] |
Series: | The expert's voice in big data; Books for professionals by professionals |
Subjects: | Big data; database management; data mining; knowledge management |
Online access: | Table of contents |
Description: | xiii, 141 pages: illustrations, diagrams |
ISBN: | 9781484213278 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV043335050 | ||
003 | DE-604 | ||
005 | 00000000000000.0 | ||
007 | t | ||
008 | 160202s2016 a||| |||| 00||| eng d | ||
020 | |a 9781484213278 |c pbk |9 978-1-4842-1327-8 | ||
035 | |a (OCoLC)953705364 | ||
035 | |a (DE-599)BVBBV043335050 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-11 |a DE-473 | ||
082 | 0 | |a 004 |2 23 | |
084 | |a ST 265 |0 (DE-625)143634: |2 rvk | ||
084 | |a ST 530 |0 (DE-625)143679: |2 rvk | ||
100 | 1 | |a Azarmi, Bahaaldine |e Verfasser |0 (DE-588)108195924X |4 aut | |
245 | 1 | 0 | |a Scalable big data architecture |b a practitioner's guide to choosing relevant big data architecture |c Bahaaldine Azarmi |
264 | 1 | |a New York |b Apress |c [2016] | |
264 | 4 | |c © 2016 | |
300 | |a xiii, 141 Seiten |b Illustrationen, Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 0 | |a The expert's voice in big data | |
490 | 0 | |a Books for professionals by professionals | |
650 | 4 | |a Computer science | |
650 | 4 | |a Database management | |
650 | 4 | |a Data mining | |
650 | 4 | |a Application software | |
650 | 4 | |a Computer Science | |
650 | 4 | |a Computer Appl. in Administrative Data Processing | |
650 | 4 | |a Database Management | |
650 | 4 | |a Data Mining and Knowledge Discovery | |
650 | 4 | |a Informatik | |
650 | 0 | 7 | |a Wissensmanagement |0 (DE-588)4561842-2 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Big Data |0 (DE-588)4802620-7 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Data Mining |0 (DE-588)4428654-5 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Datenbankverwaltung |0 (DE-588)4389357-0 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Big Data |0 (DE-588)4802620-7 |D s |
689 | 0 | 1 | |a Datenbankverwaltung |0 (DE-588)4389357-0 |D s |
689 | 0 | 2 | |a Data Mining |0 (DE-588)4428654-5 |D s |
689 | 0 | 3 | |a Wissensmanagement |0 (DE-588)4561842-2 |D s |
689 | 0 | |5 DE-604 | |
776 | 0 | 8 | |i Erscheint auch als |n Online-Ausgabe |z 978-1-4842-1326-1 |
856 | 4 | 2 | |m Digitalisierung UB Bamberg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028755119&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-028755119 |
Record in the search index
_version_ | 1804175882956308480 |
---|---|
adam_text | Contents
About the Author.......................................................................... xi
About the Technical Reviewers.............................................................xiii
■Chapter 1: The Big (Data) Problem...........................................................1
Identifying Big Data Symptoms.............................................................1
Size Matters...........................................................................1
Typical Business Use Cases.............................................................2
Understanding the Big Data Project’s Ecosystem............................................3
Hadoop Distribution....................................................................3
Data Acquisition.......................................................................6
Processing Language....................................................................7
Machine Learning......................................................................10
NoSQL Stores..........................................................................10
Creating the Foundation of a Long-Term Big Data Architecture.............................12
Architecture Overview.................................................................12
Log Ingestion Application.............................................................13
Learning Application..................................................................13
Processing Engine.....................................................................14
Search Engine.........................................................................15
Summary..................................................................................15
■Chapter 2: Early Big Data with NoSQL................................................ 17
NoSQL Landscape.......................................................................17
Key/Value..........................................................................17
Column.............................................................................18
Document...........................................................................18
Graph..............................................................................19
NoSQL in Our Use Case..............................................................20
Introducing Couchbase.................................................................21
Architecture.......................................................................22
Cluster Manager and Administration Console.........................................24
Managing Documents.................................................................28
Introducing ElasticSearch.............................................................30
Architecture.......................................................................30
Monitoring ElasticSearch...........................................................34
Search with ElasticSearch..........................................................36
Using NoSQL as a Cache in a SQL-based Architecture....................................38
Caching Document...................................................................38
ElasticSearch Plug-in for Couchbase with Couchbase XDCR............................40
ElasticSearch Only.................................................................40
Summary............................................................................. 40
■Chapter 3: Defining the Processing Topology....................................... 41
First Approach to Data Architecture...................................................41
A Little Bit of Background.........................................................41
Dealing with the Data Sources......................................................42
Processing the Data................................................................45
Splitting the Architecture............................................................49
Batch Processing...................................................................50
Stream Processing..................................................................52
The Concept of a Lambda Architecture..................................................53
Summary................................................................................55
■Chapter 4: Streaming Data.............................................................. 57
Streaming Architecture..................................................................57
Architecture Diagram................................................................57
Technologies........................................................................58
The Anatomy of the Ingested Data........................................................60
Clickstream Data....................................................................60
The Raw Data........................................................................62
The Log Generator...................................................................63
Setting Up the Streaming Architecture...................................................64
Shipping the Logs in Apache Kafka...................................................64
Draining the Logs from Apache Kafka.................................................72
Summary.................................................................................79
■Chapter 5: Querying and Analyzing Patterns...............................................81
Defining an Analytics Strategy..........................................................81
Continuous Processing...............................................................81
Real-Time Querying..................................................................82
Process and Index Data Using Spark......................................................82
Preparing the Spark Project.........................................................82
Understanding a Basic Spark Application.............................................84
Implementing the Spark Streamer.....................................................86
Implementing a Spark Indexer........................................................89
Implementing a Spark Data Processing................................................91
Data Analytics with Elasticsearch.......................................................93
Introduction to the aggregation framework...........................................93
Visualize Data in Kibana...............................................................100
Summary................................................................................103
■Chapter 6: Learning From Your Data?..................................................... 105
Introduction to Machine Learning........................................................105
Supervised Learning..................................................................105
Unsupervised Learning................................................................107
Machine Learning with Spark..........................................................108
Adding Machine Learning to Our Architecture..........................................108
Adding Machine Learning to Our Architecture.............................................112
Enriching the Clickstream Data.......................................................112
Labelizing the Data..................................................................117
Training and Making Prediction.......................................................119
Summary..................................................................................121
■Chapter 7: Governance Considerations.................................................... 123
Dockerizing the Architecture............................................................123
Introducing Docker...................................................................123
Installing Docker....................................................................125
Creating Your Docker Images..........................................................125
Composing the Architecture...........................................................128
Architecture Scalability................................................................132
Sizing and Scaling the Architecture..................................................132
Monitoring the Infrastructure Using the Elastic Stack................................135
Considering Security.................................................................136
Summary..................................................................................137
Index....................................................................................139
|
any_adam_object | 1 |
author | Azarmi, Bahaaldine |
author_GND | (DE-588)108195924X |
author_facet | Azarmi, Bahaaldine |
author_role | aut |
author_sort | Azarmi, Bahaaldine |
author_variant | b a ba |
building | Verbundindex |
bvnumber | BV043335050 |
classification_rvk | ST 265 ST 530 |
ctrlnum | (OCoLC)953705364 (DE-599)BVBBV043335050 |
dewey-full | 004 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 004 - Computer science |
dewey-raw | 004 |
dewey-search | 004 |
dewey-sort | 14 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>02261nam a2200565 c 4500</leader><controlfield tag="001">BV043335050</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">00000000000000.0</controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">160202s2016 a||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781484213278</subfield><subfield code="c">pbk</subfield><subfield code="9">978-1-4842-1327-8</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)953705364</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV043335050</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-11</subfield><subfield code="a">DE-473</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">004</subfield><subfield code="2">23</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 265</subfield><subfield code="0">(DE-625)143634:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 530</subfield><subfield code="0">(DE-625)143679:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Azarmi, Bahaaldine</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)108195924X</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Scalable big data architecture</subfield><subfield code="b">a practitioner's guide to choosing relevant big data 
architecture</subfield><subfield code="c">Bahaaldine Azarmi</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">New York</subfield><subfield code="b">Apress</subfield><subfield code="c">[2016]</subfield></datafield><datafield tag="264" ind1=" " ind2="4"><subfield code="c">© 2016</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">xiii, 141 Seiten</subfield><subfield code="b">Illustrationen, Diagramme</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">The expert's voice in big data</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">Books for professionals by professionals</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Computer science</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Database management</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Data mining</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Application software</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Computer Science</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Computer Appl. 
in Administrative Data Processing</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Database Management</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Data Mining and Knowledge Discovery</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Informatik</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Wissensmanagement</subfield><subfield code="0">(DE-588)4561842-2</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Big Data</subfield><subfield code="0">(DE-588)4802620-7</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Data Mining</subfield><subfield code="0">(DE-588)4428654-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Datenbankverwaltung</subfield><subfield code="0">(DE-588)4389357-0</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Big Data</subfield><subfield code="0">(DE-588)4802620-7</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Datenbankverwaltung</subfield><subfield code="0">(DE-588)4389357-0</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Data Mining</subfield><subfield code="0">(DE-588)4428654-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="3"><subfield code="a">Wissensmanagement</subfield><subfield code="0">(DE-588)4561842-2</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield 
code="5">DE-604</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Online-Ausgabe</subfield><subfield code="z">978-1-4842-1326-1</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Bamberg - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028755119&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-028755119</subfield></datafield></record></collection> |
id | DE-604.BV043335050 |
illustrated | Illustrated |
indexdate | 2024-07-10T07:23:21Z |
institution | BVB |
isbn | 9781484213278 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-028755119 |
oclc_num | 953705364 |
open_access_boolean | |
owner | DE-11 DE-473 DE-BY-UBG |
owner_facet | DE-11 DE-473 DE-BY-UBG |
physical | xiii, 141 Seiten Illustrationen, Diagramme |
publishDate | 2016 |
publishDateSearch | 2016 |
publishDateSort | 2016 |
publisher | Apress |
record_format | marc |
series2 | The expert's voice in big data Books for professionals by professionals |
spelling | Azarmi, Bahaaldine Verfasser (DE-588)108195924X aut Scalable big data architecture a practitioner's guide to choosing relevant big data architecture Bahaaldine Azarmi New York Apress [2016] © 2016 xiii, 141 Seiten Illustrationen, Diagramme txt rdacontent n rdamedia nc rdacarrier The expert's voice in big data Books for professionals by professionals Computer science Database management Data mining Application software Computer Science Computer Appl. in Administrative Data Processing Database Management Data Mining and Knowledge Discovery Informatik Wissensmanagement (DE-588)4561842-2 gnd rswk-swf Big Data (DE-588)4802620-7 gnd rswk-swf Data Mining (DE-588)4428654-5 gnd rswk-swf Datenbankverwaltung (DE-588)4389357-0 gnd rswk-swf Big Data (DE-588)4802620-7 s Datenbankverwaltung (DE-588)4389357-0 s Data Mining (DE-588)4428654-5 s Wissensmanagement (DE-588)4561842-2 s DE-604 Erscheint auch als Online-Ausgabe 978-1-4842-1326-1 Digitalisierung UB Bamberg - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028755119&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Azarmi, Bahaaldine Scalable big data architecture a practitioner's guide to choosing relevant big data architecture Computer science Database management Data mining Application software Computer Science Computer Appl. in Administrative Data Processing Database Management Data Mining and Knowledge Discovery Informatik Wissensmanagement (DE-588)4561842-2 gnd Big Data (DE-588)4802620-7 gnd Data Mining (DE-588)4428654-5 gnd Datenbankverwaltung (DE-588)4389357-0 gnd |
subject_GND | (DE-588)4561842-2 (DE-588)4802620-7 (DE-588)4428654-5 (DE-588)4389357-0 |
title | Scalable big data architecture a practitioner's guide to choosing relevant big data architecture |
title_auth | Scalable big data architecture a practitioner's guide to choosing relevant big data architecture |
title_exact_search | Scalable big data architecture a practitioner's guide to choosing relevant big data architecture |
title_full | Scalable big data architecture a practitioner's guide to choosing relevant big data architecture Bahaaldine Azarmi |
title_fullStr | Scalable big data architecture a practitioner's guide to choosing relevant big data architecture Bahaaldine Azarmi |
title_full_unstemmed | Scalable big data architecture a practitioner's guide to choosing relevant big data architecture Bahaaldine Azarmi |
title_short | Scalable big data architecture |
title_sort | scalable big data architecture a practitioner s guide to choosing relevant big data architecture |
title_sub | a practitioner's guide to choosing relevant big data architecture |
topic | Computer science Database management Data mining Application software Computer Science Computer Appl. in Administrative Data Processing Database Management Data Mining and Knowledge Discovery Informatik Wissensmanagement (DE-588)4561842-2 gnd Big Data (DE-588)4802620-7 gnd Data Mining (DE-588)4428654-5 gnd Datenbankverwaltung (DE-588)4389357-0 gnd |
topic_facet | Computer science Database management Data mining Application software Computer Science Computer Appl. in Administrative Data Processing Database Management Data Mining and Knowledge Discovery Informatik Wissensmanagement Big Data Data Mining Datenbankverwaltung |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028755119&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT azarmibahaaldine scalablebigdataarchitectureapractitionersguidetochoosingrelevantbigdataarchitecture |