Beginning Azure Synapse Analytics: Transition from Data Warehouse to Data Lakehouse
Main Author: | Shiyal, Bhadresh |
---|---|
Format: | Electronic eBook |
Language: | English |
Published: | Berkeley, CA : Apress L. P., 2021 |
Subjects: | Data warehousing-Management; Microsoft Azure (Computing platform) |
Online Access: | HWR01 |
Description: | 1 online resource (263 pages) |
ISBN: | 9781484270615 |
Internal format
MARC
LEADER | 00000nmm a2200000 c 4500 | ||
---|---|---|---|
001 | BV048523185 | ||
003 | DE-604 | ||
005 | 20230301 | ||
007 | cr|uuu---uuuuu | ||
008 | 221020s2021 |||| o||u| ||||||eng d | ||
020 | |a 9781484270615 |q (electronic bk.) |9 9781484270615 | ||
020 | |z 9781484270608 |9 9781484270608 | ||
035 | |a (ZDB-30-PQE)EBC6648095 | ||
035 | |a (ZDB-30-PAD)EBC6648095 | ||
035 | |a (ZDB-89-EBL)EBL6648095 | ||
035 | |a (OCoLC)1257400854 | ||
035 | |a (DE-599)BVBBV048523185 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-2070s | ||
100 | 1 | |a Shiyal, Bhadresh |e Verfasser |4 aut | |
245 | 1 | 0 | |a Beginning Azure Synapse Analytics |b Transition from Data Warehouse to Data Lakehouse |
264 | 1 | |a Berkeley, CA |b Apress L. P. |c 2021 | |
264 | 4 | |c ©2021 | |
300 | |a 1 Online-Ressource (263 Seiten) | ||
336 | |b txt |2 rdacontent | ||
337 | |b c |2 rdamedia | ||
338 | |b cr |2 rdacarrier | ||
505 | 8 | |a Intro -- Table of Contents -- About the Author -- About the Technical Reviewer -- Acknowledgments -- Introduction -- Chapter 1: Core Data and Analytics Concepts -- Core Data Concepts -- What Is Data? -- Structured Data -- Semi-structured Data -- Unstructured Data -- Data Processing Methods -- Batch Data Processing -- Streaming or Real-Time Data Processing -- Relational Data and Its Characteristics -- Non-Relational Data and Its Characteristics -- Core Data Analytics Concepts -- What Is Data Analytics? -- Data Ingestion -- Data Exploration -- Data Processing -- ETL -- ELT -- ELT / ETL Tools -- Data Visualization -- Data Analytics Categories -- Descriptive Analytics -- Diagnostic Analytics -- Predictive Analytics -- Prescriptive Analytics -- Cognitive Analytics -- Summary -- Chapter 2: Modern Data Warehouses and Data Lakehouses -- What Is a Data Warehouse? -- Core Data Warehouse Concepts -- Data Model -- Model Types -- Schema Types -- Metadata -- Why Do We Need a Data Warehouse? -- Efficient Decision-Making -- Separation of Concerns -- Single Version of the Truth -- Data Restructuring -- Self-Service BI -- Historical Data -- Security -- Data Quality -- Data Mining -- More Revenues -- What Is a Modern Data Warehouse? -- Difference Between Traditional & Modern Data Warehouses -- Cloud vs. On-Premises -- Separation of Compute and Storage Resources -- Cost -- Scalability -- ETL vs. ELT -- Disaster Recovery -- Overall Architecture -- Data Lakehouse -- What Is a Data Lake? -- What Is Delta Lake? -- What Is Apache Spark? -- What Is a Data Lakehouse? -- Characteristics of a Data Lakehouse -- Various Data Types -- AI -- Decoupled Compute and Storage Resources -- Open Source Storage Format -- Data Analytics and BI Tools -- ACID Properties -- Differences Between a Data Warehouse and a Data Lakehouse -- Architecture -- Access to Raw Data | |
505 | 8 | |a Open Source vs. Proprietary -- Workloads -- Query Engines -- Data Processing -- Real-Time Data -- Examples of Data Lakehouses -- Azure Synapse Analytics -- Databricks -- Benefits of Data Lakehouse -- Support for All Types of Data -- Time to Market -- More Cost Effective -- AI -- Reduction in ETL/ELT Jobs -- Usage of Open Source Tools and Technologies -- Efficient and Easy Data Governance -- Drawbacks of Data Lakehouse -- Monolithic Architecture -- Technical Infancy -- Migration Cost -- Lack of Many Products/Options -- Scarcity of Skilled Technical Resources -- Summary -- Chapter 3: Introduction to Azure Synapse Analytics -- What Is Azure Synapse Analytics? -- Azure Synapse Analytics vs. Azure SQL Data Warehouse -- Why Should You Learn Azure Synapse Analytics? -- Main Features of Azure Synapse Analytics -- Unified Data Analytics Experience -- Powerful Data Insights -- Unlimited Scale -- Security, Privacy, and Compliance -- HTAP -- Key Service Capabilities of Azure Synapse Analytics -- Data Lake Exploration -- Multiple Language Support -- Deeply Integrated Apache Spark -- Serverless Synapse SQL Pool -- Hybrid Data Integration -- Power BI Integration -- AI Integration -- Enterprise Data Warehousing -- Seamless Streaming Analytics -- Workload Management -- Advanced Security -- Summary -- Chapter 4: Architecture and Its Main Components -- High-Level Architecture -- Main Components of Architecture -- Synapse SQL -- Compute Layer -- Dedicated Synapse SQL Pool -- Serverless Synapse SQL Pool -- Storage Layer -- Synapse Spark or Apache Spark -- Synapse Pipelines -- Synapse Studio -- Synapse Link -- Summary -- Chapter 5: Synapse SQL -- Synapse SQL Architecture Components -- Massively Parallel Processing Engine -- Distributed Query Processing Engine -- Control Node -- Compute Nodes -- Data Movement Service -- Distribution -- Hash Distribution | |
505 | 8 | |a Round-Robin Distribution -- Replication-based Distribution -- Azure Storage -- Dedicated or Provisioned Synapse SQL Pool -- Serverless or On-Demand Synapse SQL Pool -- Synapse SQL Feature Comparison -- Database Object Types -- Query Language -- Security -- Tools -- Storage Options -- Data Formats -- Resource Consumption Model for Synapse SQL -- Synapse SQL Best Practices -- Best Practices for Serverless Synapse SQL Pool -- Best Practices for Dedicated Synapse SQL Pool -- How-To's -- Create a Dedicated Synapse SQL Pool -- Create a Serverless or On-Demand Synapse SQL Pool -- Load Data Using COPY Statement in Dedicated Synapse SQL Pool -- Ingest Data into Azure Data Lake Storage Gen2 -- Summary -- Chapter 6: Synapse Spark -- What Is Apache Spark? -- What Is Synapse Spark in Azure Synapse Analytics? -- Synapse Spark Features & Capabilities -- Speed -- Faster Start Time -- Ease of Creation -- Ease of Use -- Security -- Automatic Scalability -- Separation of Concerns -- Multiple Language Support -- Integration with IDEs -- Pre-loaded Libraries -- REST APIs -- Delta Lake and Its Importance in Synapse Spark -- Synapse Spark Job Optimization -- Data Format -- Memory Management -- Data Serialization -- Data Caching -- Data Abstraction -- Join and Shuffle Optimization -- Bucketing -- Hyperspace Indexing -- Synapse Spark Machine Learning -- Data Preparation and Exploration -- Build Machine Learning Models -- Train Machine Learning Models -- Model Deployment and Scoring -- How-To's -- How to Create a Synapse Spark Pool -- How to Create and Submit Apache Spark Job Definition in Synapse Studio Using Python -- How to Monitor Synapse Spark Pools Using Synapse Studio -- Summary -- Chapter 7: Synapse Pipelines -- Overview of Azure Data Factory -- Overview of Synapse Pipelines -- Activities -- Pipelines -- Linked Services -- Dataset -- Integration Runtimes (IR) | |
505 | 8 | |a Azure Integration Runtime (Azure IR) -- Self-Hosted Integration Runtimes (SHIR) -- Azure SSIS Integration Runtimes (Azure SSIS IR) -- Control Flow -- Parameters -- Data Flow -- Data Movement Activities -- Category: Azure -- Category: Database -- Category: NoSQL -- Category: File -- Category: Generic -- Category: Services and Applications -- Data Transformation Activities -- Control Flow Activities -- Copy Pipeline Example -- Transformation Pipeline Example -- Pipeline Triggers -- Summary -- Chapter 8: Synapse Workspace and Studio -- What Is a Synapse Analytics Workspace? -- Synapse Analytics Workspace Components and Features -- Azure Data Lake Storage Gen2 Account and File System -- Serverless Synapse SQL Pool -- Shared Metadata Management -- Code Artifacts -- What Is Synapse Studio? -- Main Features of Synapse Studio -- Home Hub -- Data Hub -- Develop Hub -- Integrate Hub -- Monitor Hub -- Integration -- Activities -- Manage Hub -- Analytics Pools -- External Connections -- Integration -- Security -- Synapse Studio Capabilities -- Data Preparation -- Data Management -- Data Exploration -- Data Warehousing -- Data Visualization -- Machine Learning -- Power BI in Synapse Studio -- How-To's -- How to Create or Provision a New Azure Synapse Analytics Workspace Using Azure Portal -- How to Launch Azure Synapse Studio -- How to Link Power BI with Azure Synapse Studio -- Summary -- Chapter 9: Synapse Link -- OLTP vs. OLAP -- What Is HTAP? -- Benefits of HTAP -- No-ETL Analytics -- Instant Insights -- Reduced Data Duplication -- Simplified Technical Architecture -- What Is Azure Synapse Link? -- Azure Cosmos DB -- Azure Cosmos DB Analytical Store -- Columnar Storage -- Decoupling of Operational Store -- Automatic Data Synchronization -- SQL API and MongoDB API -- Analytical TTL -- Automatic Schema Updates -- Cost-Effective Archiving -- Scalability | |
505 | 8 | |a When to Use Azure Synapse Link for Cosmos DB -- Azure Synapse Link Limitations -- Azure Synapse Link Use Cases -- Industrial IOT -- Predictive Maintenance Pipeline -- Operational Reporting -- Real-Time Applications -- Real-Time Personalization for E-Commerce Users -- How-To's -- How to Enable Azure Synapse Link for Azure Cosmos DB -- How to Create an Azure Cosmos DB Container with Analytical Store Using Azure Portal -- How to Connect to Azure Synapse Link for Azure Cosmos DB Using Azure Portal -- Summary -- Chapter 10: Azure Synapse Analytics Use Cases and Reference Architecture -- Where Should You Use Azure Synapse Analytics? -- Large Volume of Data -- Disparate Sources of Data -- Data Transformation -- Batch or Streaming Data -- Where Should You Not Use Azure Synapse Analytics? -- Use Cases for Azure Synapse Analytics -- Financial Services -- Manufacturing -- Retail -- Healthcare -- Reference Architectures for Azure Synapse Analytics -- Modern Data Warehouse Architecture -- Real-Time Analytics on Big Data Architecture -- Summary -- Index | |
650 | 4 | |a Data warehousing-Management | |
650 | 4 | |a Microsoft Azure (Computing platform) | |
653 | 6 | |a Electronic books | |
776 | 0 | 8 | |i Erscheint auch als |n Druck-Ausgabe |a Shiyal, Bhadresh |t Beginning Azure Synapse Analytics |d Berkeley, CA : Apress L. P.,c2021 |z 9781484270608 |
912 | |a ZDB-30-PQE | ||
999 | |a oai:aleph.bib-bvb.de:BVB01-033900033 | ||
966 | e | |u https://ebookcentral.proquest.com/lib/hwr/detail.action?docID=6648095 |l HWR01 |p ZDB-30-PQE |q HWR_PDA_PQE_Kauf |x Aggregator |3 Volltext |
Record in the search index
_version_ | 1804184511269830656 |
---|---|
adam_txt | |
any_adam_object | |
any_adam_object_boolean | |
author | Shiyal, Bhadresh |
author_facet | Shiyal, Bhadresh |
author_role | aut |
author_sort | Shiyal, Bhadresh |
author_variant | b s bs |
building | Verbundindex |
bvnumber | BV048523185 |
collection | ZDB-30-PQE |
ctrlnum | (ZDB-30-PQE)EBC6648095 (ZDB-30-PAD)EBC6648095 (ZDB-89-EBL)EBL6648095 (OCoLC)1257400854 (DE-599)BVBBV048523185 |
format | Electronic eBook |
id | DE-604.BV048523185 |
illustrated | Not Illustrated |
index_date | 2024-07-03T20:50:25Z |
indexdate | 2024-07-10T09:40:30Z |
institution | BVB |
isbn | 9781484270615 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-033900033 |
oclc_num | 1257400854 |
open_access_boolean | |
owner | DE-2070s |
physical | 1 online resource (263 pages) |
psigel | ZDB-30-PQE ZDB-30-PQE HWR_PDA_PQE_Kauf |
publishDate | 2021 |
publisher | Apress L. P. |
record_format | marc |
title | Beginning Azure Synapse Analytics Transition from Data Warehouse to Data Lakehouse |
title_short | Beginning Azure Synapse Analytics |
title_sort | beginning azure synapse analytics transition from data warehouse to data lakehouse |
title_sub | Transition from Data Warehouse to Data Lakehouse |
topic | Data warehousing-Management Microsoft Azure (Computing platform) |
work_keys_str_mv | AT shiyalbhadresh beginningazuresynapseanalyticstransitionfromdatawarehousetodatalakehouse |