The enterprise big data lake: delivering the promise of big data and data science


Bibliographic Details
Main author: Gorelik, Alex (Author)
Format: Electronic E-Book
Language: English
Published: Beijing ; Boston ; Farnham ; Sebastopol ; Tokyo O'Reilly 2019
Edition: First edition
Subjects:
Online access: UBY01
Summary: Intro -- Copyright -- Table of Contents -- Preface -- Who Should Read This Book? -- Conventions Used in This Book -- O'Reilly Online Learning -- How to Contact Us -- Acknowledgments -- Chapter 1. Introduction to Data Lakes -- Data Lake Maturity -- Data Puddles -- Data Ponds -- Creating a Successful Data Lake -- The Right Platform -- The Right Data -- The Right Interface -- The Data Swamp -- Roadmap to Data Lake Success -- Standing Up a Data Lake -- Organizing the Data Lake -- Setting Up the Data Lake for Self-Service -- Data Lake Architectures -- Data Lakes in the Public Cloud -- Logical Data Lakes -- Conclusion -- Chapter 2. Historical Perspective -- The Drive for Self-Service Data-The Birth of Databases -- The Analytics Imperative-The Birth of Data Warehousing -- The Data Warehouse Ecosystem -- Storing and Querying the Data -- Loading the Data-Data Integration Tools -- Organizing and Managing the Data -- Consuming the Data -- Conclusion -- Chapter 3. Introduction to Big Data and Data Science -- Hadoop Leads the Historic Shift to Big Data -- The Hadoop File System -- How Processing and Storage Interact in a MapReduce Job -- Schema on Read -- Hadoop Projects -- Data Science -- What Should Your Analytics Organization Focus On? -- Machine Learning -- Explainability -- Change Management -- Conclusion -- Chapter 4. Starting a Data Lake -- The What and Why of Hadoop -- Preventing Proliferation of Data Puddles -- Taking Advantage of Big Data -- Leading with Data Science -- Strategy 1: Offload Existing Functionality -- Strategy 2: Data Lakes for New Projects -- Strategy 3: Establish a Central Point of Governance -- Which Way Is Right for You? -- Conclusion -- Chapter 5. From Data Ponds/Big Data Warehouses to Data Lakes -- Essential Functions of a Data Warehouse -- Dimensional Modeling for Analytics -- Integrating Data from Disparate Sources
Description: 1 online resource (xii, 205 pages) illustrations
ISBN: 9781491931523

No print copy is available.

Order via interlibrary loan. Note: not in THWS holdings!