Architecting data and machine learning platforms: enable analytics and AI-driven innovation in the cloud
Main authors: | Tranquillin, Marco; Lakshmanan, Valliappa; Tekiner, Firat |
---|---|
Format: | Book |
Language: | English |
Published: | Sebastopol, CA : O'Reilly Media, 2023 |
Edition: | First edition |
Subjects: | Cloud Computing; Datenverarbeitung; Maschinelles Lernen |
Online access: | Table of contents |
Description: | Includes index. This record also covers later, unchanged reprints. |
Physical description: | xviii, 338 pages : illustrations, diagrams |
ISBN: | 9781098151614 |
Internal format
MARC
LEADER | 00000nam a22000001c 4500 | ||
---|---|---|---|
001 | BV049723902 | ||
003 | DE-604 | ||
005 | 20240930 | ||
007 | t | ||
008 | 240531s2023 a||| |||| 00||| eng d | ||
020 | |a 9781098151614 |9 978-1-09-815161-4 | ||
035 | |a (OCoLC)1446257775 | ||
035 | |a (DE-599)BVBBV049723902 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-739 |a DE-634 |a DE-898 | ||
084 | |a ST 205 |0 (DE-625)143613: |2 rvk | ||
084 | |a ST 530 |0 (DE-625)143679: |2 rvk | ||
100 | 1 | |a Tranquillin, Marco |d ca. 20./21. Jh. |e Verfasser |0 (DE-588)1335109048 |4 aut | |
245 | 1 | 0 | |a Architecting data and machine learning platforms |b enable analytics and AI-driven innovation in the cloud |c Marco Tranquillin, Valliappa Lakshmanan and Firat Tekiner |
250 | |a First edition | ||
264 | 1 | |a Sebastopol, CA |b O'Reilly Media |c 2023 | |
300 | |a xviii, 338 Seiten |b Illustrationen, Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
500 | |a Includes index | ||
500 | |a Hier auch später erschienene, unveränderte Nachdrucke | ||
650 | 0 | 7 | |a Cloud Computing |0 (DE-588)7623494-0 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Datenverarbeitung |0 (DE-588)4011152-0 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Datenverarbeitung |0 (DE-588)4011152-0 |D s |
689 | 0 | 1 | |a Cloud Computing |0 (DE-588)7623494-0 |D s |
689 | 0 | 2 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Lakshmanan, Valliappa |e Verfasser |0 (DE-588)1222923750 |4 aut | |
700 | 1 | |a Tekiner, Firat |d ca. 20./21. Jh. |e Verfasser |0 (DE-588)1335111069 |4 aut | |
776 | 0 | 8 | |i Erscheint auch als |n Online-Ausgabe |z 978-1-09-815158-4 |
856 | 4 | 2 | |m Digitalisierung UB Passau - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=035066223&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
943 | 1 | |a oai:aleph.bib-bvb.de:BVB01-035066223 |
Record in the search index
_version_ | 1815416429327941632 |
---|---|
adam_text |
Table of Contents
Preface
1. Modernizing Your Data Platform: An Introductory Overview -- The Data Lifecycle -- The Journey to Wisdom -- Water Pipes Analogy -- Collect -- Store -- Process/Transform -- Analyze/Visualize -- Activate -- Limitations of Traditional Approaches -- Antipattern: Breaking Down Silos Through ETL -- Antipattern: Centralization of Control -- Antipattern: Data Marts and Hadoop -- Creating a Unified Analytics Platform -- Cloud Instead of On-Premises -- Drawbacks of Data Marts and Data Lakes -- Convergence of DWHs and Data Lakes -- Hybrid Cloud -- Reasons Why Hybrid Is Necessary -- Challenges of Hybrid Cloud -- Why Hybrid Can Work -- Edge Computing -- Applying AI -- Machine Learning -- Uses of ML -- Why Cloud for AI? -- Cloud Infrastructure -- Democratization -- Real Time -- MLOps -- Core Principles -- Summary
2. Strategic Steps to Innovate with Data -- Step 1: Strategy and Planning -- Strategic Goals -- Identify Stakeholders -- Change Management -- Step 2: Reduce Total Cost of Ownership by Adopting a Cloud Approach -- Why Cloud Costs Less -- How Much Are the Savings? -- When Does Cloud Help? -- Step 3: Break Down Silos -- Unifying Data Access -- Choosing Storage -- Semantic Layer -- Step 4: Make Decisions in Context Faster -- Batch to Stream -- Contextual Information -- Cost Management -- Step 5: Leapfrog with Packaged AI Solutions -- Predictive Analytics -- Understanding and Generating Unstructured Data -- Personalization -- Packaged Solutions -- Step 6: Operationalize AI-Driven Workflows -- Identifying the Right Balance of Automation and Assistance -- Building a Data Culture -- Populating Your Data Science Team -- Step 7: Product Management for Data -- Applying Product Management Principles to Data -- 1. Understand and Maintain a Map of Data Flows in the Enterprise -- 2. Identify Key Metrics -- 3. Agreed Criteria, Committed Roadmap, and Visionary Backlog -- 4. Build for the Customers You Have -- 5. Don’t Shift the Burden of Change Management -- 6. Interview Customers to Discover Their Data Needs -- 7. Whiteboard and Prototype Extensively -- 8. Build Only What Will Be Used Immediately -- 9. Standardize Common Entities and KPIs -- 10. Provide Self-Service Capabilities in Your Data Platform -- Summary
3. Designing for Your Data Team -- Classifying Data Processing Organizations -- Data Analysis-Driven Organization -- The Vision -- The Personas -- The Technological Framework -- Data Engineering-Driven Organization -- The Vision -- The Personas -- The Technological Framework -- Data Science-Driven Organization -- The Vision -- The Personas -- The Technological Framework -- Summary
4. A Migration Framework -- Modernize Data Workflows -- Holistic View -- Modernize Workflows -- Transform the Workflow Itself -- A Four-Step Migration Framework -- Prepare and Discover -- Assess and Plan -- Execute -- Optimize -- Estimating the Overall Cost of the Solution -- Audit of the Existing Infrastructure -- Request for Information/Proposal and Quotation -- Proof of Concept/Minimum Viable Product -- Setting Up Security and Data Governance -- Framework -- Artifacts -- Governance over the Life of the Data -- Schema, Pipeline, and Data Migration -- Schema Migration -- Pipeline Migration -- Data Migration -- Migration Stages -- Summary
5. Architecting a Data Lake -- Data Lake and the Cloud—A Perfect Marriage -- Challenges with On-Premises Data Lakes -- Benefits of Cloud Data Lakes -- Design and Implementation -- Batch and Stream -- Data Catalog -- Hadoop Landscape -- Cloud Data Lake Reference Architecture -- Integrating the Data Lake: The Real Superpower -- APIs to Extend the Lake -- The Evolution of Data Lake with Apache Iceberg, Apache Hudi, and Delta Lake -- Interactive Analytics with Notebooks -- Democratizing Data Processing and Reporting -- Build Trust in the Data -- Data Ingestion Is Still an IT Matter -- ML in the Data Lake -- Training on Raw Data -- Predicting in the Data Lake -- Summary
6. Innovating with an Enterprise Data Warehouse -- A Modern Data Platform -- Organizational Goals -- Technological Challenges -- Technology Trends and Tools -- Hub-and-Spoke Architecture -- Data Ingest -- Business Intelligence -- Transformations -- Organizational Structure -- DWH to Enable Data Scientists -- Query Interface -- Storage API -- ML Without Moving Your Data -- Summary
7. Converging to a Lakehouse -- The Need for a Unique Architecture -- User Personas -- Antipattern: Disconnected Systems -- Antipattern: Duplicated Data -- Converged Architecture -- Two Forms -- Lakehouse on Cloud Storage -- SQL-First Lakehouse -- The Benefits of Convergence -- Summary
8. Architectures for Streaming -- The Value of Streaming -- Industry Use Cases -- Streaming Use Cases -- Streaming Ingest -- Streaming ETL -- Streaming ELT -- Streaming Insert -- Streaming from Edge Devices (IoT) -- Streaming Sinks -- Real-Time Dashboards -- Live Querying -- Materialize Some Views -- Stream Analytics -- Time-Series Analytics -- Clickstream Analytics -- Anomaly Detection -- Resilient Streaming -- Continuous Intelligence Through ML -- Training Model on Streaming Data -- Streaming ML Inference -- Automated Actions -- Summary
9. Extending a Data Platform Using Hybrid and Edge -- Why Multicloud? -- A Single Cloud Is Simpler and Cost-Effective -- Multicloud Is Inevitable -- Multicloud Could Be Strategic -- Multicloud Architectural Patterns -- Single Pane of Glass -- Write Once, Run Anywhere -- Bursting from On Premises to Cloud -- Pass-Through from On Premises to Cloud -- Data Integration Through Streaming -- Adopting Multicloud -- Framework -- Time Scale -- Define a Target Multicloud Architecture -- Why Edge Computing? -- Bandwidth, Latency, and Patchy Connectivity -- Use Cases -- Benefits -- Challenges -- Edge Computing Architectural Patterns -- Smart Devices -- Smart Gateways -- ML Activation -- Adopting Edge Computing -- The Initial Context -- The Project -- The Final Outcomes and Next Steps -- Summary
10. AI Application Architecture -- Is This an AI/ML Problem? -- Subfields of AI -- Generative AI -- Problems Fit for ML -- Buy, Adapt, or Build? -- Data Considerations -- When to Buy -- What Can You Buy? -- How Adapting Works -- AI Architectures -- Understanding Unstructured Data -- Generating Unstructured Data -- Predicting Outcomes -- Forecasting Values -- Anomaly Detection -- Personalization -- Automation -- Responsible AI -- AI Principles -- ML Fairness -- Explainability -- Summary
11. Architecting an ML Platform -- ML Activities -- Developing ML Models -- Labeling Environment -- Development Environment -- User Environment -- Preparing Data -- Training ML Models -- Deploying ML Models -- Deploying to an Endpoint -- Evaluate Model -- Hybrid and Multicloud -- Training-Serving Skew -- Automation -- Automate Training and Deployment -- Orchestration with Pipelines -- Continuous Evaluation and Training -- Choosing the ML Framework -- Team Skills -- Task Considerations -- User-Centric -- Summary
12. Data Platform Modernization: A Model Case -- New Technology for a New Era -- The Need for Change -- It Is Not Only a Matter of Technology -- The Beginning of the Journey -- The Current Environment -- The Target Environment -- The PoC Use Case -- The RFP Responses Proposed by Cloud Vendors -- The Target Environment -- The Approach on Migration -- The RFP Evaluation Process -- The Scope of the PoC -- The Execution of the PoC -- The Final Decision -- Peroration -- Summary
Index |
any_adam_object | 1 |
author | Tranquillin, Marco ca. 20./21. Jh Lakshmanan, Valliappa Tekiner, Firat ca. 20./21. Jh |
author_GND | (DE-588)1335109048 (DE-588)1222923750 (DE-588)1335111069 |
author_facet | Tranquillin, Marco ca. 20./21. Jh Lakshmanan, Valliappa Tekiner, Firat ca. 20./21. Jh |
author_role | aut aut aut |
author_sort | Tranquillin, Marco ca. 20./21. Jh |
author_variant | m t mt v l vl f t ft |
building | Verbundindex |
bvnumber | BV049723902 |
classification_rvk | ST 205 ST 530 |
ctrlnum | (OCoLC)1446257775 (DE-599)BVBBV049723902 |
discipline | Informatik |
edition | First edition |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>00000nam a22000001c 4500</leader><controlfield tag="001">BV049723902</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20240930</controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">240531s2023 a||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781098151614</subfield><subfield code="9">978-1-09-815161-4</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1446257775</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV049723902</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-739</subfield><subfield code="a">DE-634</subfield><subfield code="a">DE-898</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 205</subfield><subfield code="0">(DE-625)143613:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 530</subfield><subfield code="0">(DE-625)143679:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Tranquillin, Marco</subfield><subfield code="d">ca. 20./21. 
Jh.</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1335109048</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Architecting data and machine learning platforms</subfield><subfield code="b">enable analytics and AI-driven innovation in the cloud</subfield><subfield code="c">Marco Tranquillin, Valliappa Lakshmanan and Firat Tekiner</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">First edition</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Sebastopol, CA</subfield><subfield code="b">O'Reilly Media</subfield><subfield code="c">2023</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">xviii, 338 Seiten</subfield><subfield code="b">Illustrationen, Diagramme</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Includes index</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Hier auch später erschienene, unveränderte Nachdrucke</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Cloud Computing</subfield><subfield code="0">(DE-588)7623494-0</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Datenverarbeitung</subfield><subfield code="0">(DE-588)4011152-0</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Maschinelles Lernen</subfield><subfield 
code="0">(DE-588)4193754-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Datenverarbeitung</subfield><subfield code="0">(DE-588)4011152-0</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Cloud Computing</subfield><subfield code="0">(DE-588)7623494-0</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Maschinelles Lernen</subfield><subfield code="0">(DE-588)4193754-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Lakshmanan, Valliappa</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1222923750</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Tekiner, Firat</subfield><subfield code="d">ca. 20./21. Jh.</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1335111069</subfield><subfield code="4">aut</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Online-Ausgabe</subfield><subfield code="z">978-1-09-815158-4</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Passau - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=035066223&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="943" ind1="1" ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-035066223</subfield></datafield></record></collection> |
id | DE-604.BV049723902 |
illustrated | Illustrated |
indexdate | 2024-11-11T09:07:02Z |
institution | BVB |
isbn | 9781098151614 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-035066223 |
oclc_num | 1446257775 |
open_access_boolean | |
owner | DE-739 DE-634 DE-898 DE-BY-UBR |
owner_facet | DE-739 DE-634 DE-898 DE-BY-UBR |
physical | xviii, 338 Seiten Illustrationen, Diagramme |
publishDate | 2023 |
publishDateSearch | 2023 |
publishDateSort | 2023 |
publisher | O'Reilly Media |
record_format | marc |
spelling | Tranquillin, Marco ca. 20./21. Jh. Verfasser (DE-588)1335109048 aut Architecting data and machine learning platforms enable analytics and AI-driven innovation in the cloud Marco Tranquillin, Valliappa Lakshmanan and Firat Tekiner First edition Sebastopol, CA O'Reilly Media 2023 xviii, 338 Seiten Illustrationen, Diagramme txt rdacontent n rdamedia nc rdacarrier Includes index Hier auch später erschienene, unveränderte Nachdrucke Cloud Computing (DE-588)7623494-0 gnd rswk-swf Datenverarbeitung (DE-588)4011152-0 gnd rswk-swf Maschinelles Lernen (DE-588)4193754-5 gnd rswk-swf Datenverarbeitung (DE-588)4011152-0 s Cloud Computing (DE-588)7623494-0 s Maschinelles Lernen (DE-588)4193754-5 s DE-604 Lakshmanan, Valliappa Verfasser (DE-588)1222923750 aut Tekiner, Firat ca. 20./21. Jh. Verfasser (DE-588)1335111069 aut Erscheint auch als Online-Ausgabe 978-1-09-815158-4 Digitalisierung UB Passau - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=035066223&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Tranquillin, Marco ca. 20./21. Jh Lakshmanan, Valliappa Tekiner, Firat ca. 20./21. Jh Architecting data and machine learning platforms enable analytics and AI-driven innovation in the cloud Cloud Computing (DE-588)7623494-0 gnd Datenverarbeitung (DE-588)4011152-0 gnd Maschinelles Lernen (DE-588)4193754-5 gnd |
subject_GND | (DE-588)7623494-0 (DE-588)4011152-0 (DE-588)4193754-5 |
title | Architecting data and machine learning platforms enable analytics and AI-driven innovation in the cloud |
title_auth | Architecting data and machine learning platforms enable analytics and AI-driven innovation in the cloud |
title_exact_search | Architecting data and machine learning platforms enable analytics and AI-driven innovation in the cloud |
title_full | Architecting data and machine learning platforms enable analytics and AI-driven innovation in the cloud Marco Tranquillin, Valliappa Lakshmanan and Firat Tekiner |
title_fullStr | Architecting data and machine learning platforms enable analytics and AI-driven innovation in the cloud Marco Tranquillin, Valliappa Lakshmanan and Firat Tekiner |
title_full_unstemmed | Architecting data and machine learning platforms enable analytics and AI-driven innovation in the cloud Marco Tranquillin, Valliappa Lakshmanan and Firat Tekiner |
title_short | Architecting data and machine learning platforms |
title_sort | architecting data and machine learning platforms enable analytics and ai driven innovation in the cloud |
title_sub | enable analytics and AI-driven innovation in the cloud |
topic | Cloud Computing (DE-588)7623494-0 gnd Datenverarbeitung (DE-588)4011152-0 gnd Maschinelles Lernen (DE-588)4193754-5 gnd |
topic_facet | Cloud Computing Datenverarbeitung Maschinelles Lernen |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=035066223&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT tranquillinmarco architectingdataandmachinelearningplatformsenableanalyticsandaidriveninnovationinthecloud AT lakshmananvalliappa architectingdataandmachinelearningplatformsenableanalyticsandaidriveninnovationinthecloud AT tekinerfirat architectingdataandmachinelearningplatformsenableanalyticsandaidriveninnovationinthecloud |