Learning Apache Drill: query and analyze distributed data sources with SQL
Main authors: | Givre, Charles; Rogers, Paul |
---|---|
Format: | Book |
Language: | English |
Published: | Beijing, O'Reilly, October 2018 |
Edition: | first edition |
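Subjects: | Apache Hadoop; Apache Drill; File organization (Computer science); Querying (Computer science); SQL (Computer program language); Big data |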
Summary: | Get up to speed with Apache Drill, an extensible distributed SQL query engine that reads massive datasets in many popular file formats such as Parquet, JSON, and CSV. Drill reads data in HDFS or in cloud-native storage such as S3 and works with Hive metastores along with distributed databases such as HBase, MongoDB, and relational databases. Drill works everywhere: on your laptop or in your largest cluster. In this practical book, Drill committers Charles Givre and Paul Rogers show analysts and data scientists how to query and analyze raw data using this powerful tool. Data scientists today spend about 80% of their time just gathering and cleaning data. With this book, you'll learn how Drill helps you analyze data more effectively to drive down time to insight. Use Drill to clean, prepare, and summarize delimited data for further analysis; Query file types including logfiles, Parquet, JSON, and other complex formats; Query Hadoop, relational databases, MongoDB, and Kafka with standard SQL; Connect to Drill programmatically using a variety of languages; Use Drill even with challenging or ambiguous file formats; Perform sophisticated analysis by extending Drill's functionality with user-defined functions; Facilitate data analysis for network security, image metadata, and machine learning |
Description: | XVI, 311 pages, illustrations, diagrams |
ISBN: | 9781492032793 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV045448145 | ||
003 | DE-604 | ||
005 | 20190418 | ||
007 | t | ||
008 | 190206s2018 a||| |||| 00||| eng d | ||
015 | |a GBB8L9800 |2 dnb | ||
020 | |a 9781492032793 |c pbk |9 978-1-492-03279-3 | ||
035 | |a (OCoLC)1089711425 | ||
035 | |a (DE-599)BVBBV045448145 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-29T | ||
100 | 1 | |a Givre, Charles |e Verfasser |4 aut | |
245 | 1 | 0 | |a Learning Apache Drill |b query and analyze distributed data sources with SQL |c Charles Givre and Paul Rogers |
250 | |a first edition | ||
264 | 1 | |a Beijing |b O'Reilly |c October 2018 | |
264 | 4 | |c © 2019 | |
300 | |a XVI, 311 Seiten |b Illustrationen, Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
520 | |a Get up to speed with Apache Drill, an extensible distributed SQL query engine that reads massive datasets in many popular file formats such as Parquet, JSON, and CSV. Drill reads data in HDFS or in cloud-native storage such as S3 and works with Hive metastores along with distributed databases such as HBase, MongoDB, and relational databases. Drill works everywhere: on your laptop or in your largest cluster. In this practical book, Drill committers Charles Givre and Paul Rogers show analysts and data scientists how to query and analyze raw data using this powerful tool. Data scientists today spend about 80% of their time just gathering and cleaning data. With this book, you'll learn how Drill helps you analyze data more effectively to drive down time to insight. Use Drill to clean, prepare, and summarize delimited data for further analysis ; Query file types including logfiles, Parquet, JSON, and other complex formats ; Query Hadoop, relational databases, MongoDB, and Kafka with standard SQL ; Connect to Drill programmatically using a variety of languages ; Use Drill even with challenging or ambiguous file formats ; Perform sophisticated analysis by extending Drill's functionality with user-defined functions ; Facilitate data analysis for network security, image metadata, and machine learning | ||
650 | 4 | |a Apache Hadoop | |
650 | 4 | |a Apache Drill | |
650 | 7 | |a Apache Hadoop |2 fast | |
650 | 4 | |a File organization (Computer science) | |
650 | 4 | |a Querying (Computer science) | |
650 | 4 | |a SQL (Computer program language) | |
650 | 4 | |a Big data | |
650 | 7 | |a Big data |2 fast | |
650 | 7 | |a File organization (Computer science) |2 fast | |
650 | 7 | |a Querying (Computer science) |2 fast | |
650 | 7 | |a SQL (Computer program language) |2 fast | |
700 | 1 | |a Rogers, Paul |e Verfasser |4 aut | |
999 | |a oai:aleph.bib-bvb.de:BVB01-030833579 |
Record in the search index
_version_ | 1804179343709044736 |
---|---|
any_adam_object | |
author | Givre, Charles Rogers, Paul |
author_facet | Givre, Charles Rogers, Paul |
author_role | aut aut |
author_sort | Givre, Charles |
author_variant | c g cg p r pr |
building | Verbundindex |
bvnumber | BV045448145 |
ctrlnum | (OCoLC)1089711425 (DE-599)BVBBV045448145 |
edition | first edition |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>02707nam a2200445 c 4500</leader><controlfield tag="001">BV045448145</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20190418 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">190206s2018 a||| |||| 00||| eng d</controlfield><datafield tag="015" ind1=" " ind2=" "><subfield code="a">GBB8L9800</subfield><subfield code="2">dnb</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781492032793</subfield><subfield code="c">pbk</subfield><subfield code="9">978-1-492-03279-3</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1089711425</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV045448145</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-29T</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Givre, Charles</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Learning Apache Drill</subfield><subfield code="b">query and analyze distributed data sources with SQL</subfield><subfield code="c">Charles Givre and Paul Rogers</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">first edition</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Beijing</subfield><subfield code="b">O'Reilly</subfield><subfield code="c">October 2018</subfield></datafield><datafield tag="264" ind1=" " ind2="4"><subfield code="c">© 2019</subfield></datafield><datafield 
tag="300" ind1=" " ind2=" "><subfield code="a">XVI, 311 Seiten</subfield><subfield code="b">Illustrationen, Diagramme</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Get up to speed with Apache Drill, an extensible distributed SQL query engine that reads massive datasets in many popular file formats such as Parquet, JSON, and CSV. Drill reads data in HDFS or in cloud-native storage such as S3 and works with Hive metastores along with distributed databases such as HBase, MongoDB, and relational databases. Drill works everywhere: on your laptop or in your largest cluster. In this practical book, Drill committers Charles Givre and Paul Rogers show analysts and data scientists how to query and analyze raw data using this powerful tool. Data scientists today spend about 80% of their time just gathering and cleaning data. With this book, you'll learn how Drill helps you analyze data more effectively to drive down time to insight. 
Use Drill to clean, prepare, and summarize delimited data for further analysis ; Query file types including logfiles, Parquet, JSON, and other complex formats ; Query Hadoop, relational databases, MongoDB, and Kafka with standard SQL ; Connect to Drill programmatically using a variety of languages ; Use Drill even with challenging or ambiguous file formats ; Perform sophisticated analysis by extending Drill's functionality with user-defined functions ; Facilitate data analysis for network security, image metadata, and machine learning</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Apache Hadoop</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Apache Drill</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Apache Hadoop</subfield><subfield code="2">fast</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">File organization (Computer science)</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Querying (Computer science)</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">SQL (Computer program language)</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Big data</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Big data</subfield><subfield code="2">fast</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">File organization (Computer science)</subfield><subfield code="2">fast</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Querying (Computer science)</subfield><subfield code="2">fast</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">SQL (Computer program language)</subfield><subfield code="2">fast</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Rogers, Paul</subfield><subfield code="e">Verfasser</subfield><subfield 
code="4">aut</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-030833579</subfield></datafield></record></collection> |
id | DE-604.BV045448145 |
illustrated | Illustrated |
indexdate | 2024-07-10T08:18:21Z |
institution | BVB |
isbn | 9781492032793 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-030833579 |
oclc_num | 1089711425 |
open_access_boolean | |
owner | DE-29T |
owner_facet | DE-29T |
physical | XVI, 311 Seiten Illustrationen, Diagramme |
publishDate | 2018 |
publishDateSearch | 2018 |
publishDateSort | 2018 |
publisher | O'Reilly |
record_format | marc |
spelling | Givre, Charles Verfasser aut Learning Apache Drill query and analyze distributed data sources with SQL Charles Givre and Paul Rogers first edition Beijing O'Reilly October 2018 © 2019 XVI, 311 Seiten Illustrationen, Diagramme txt rdacontent n rdamedia nc rdacarrier Get up to speed with Apache Drill, an extensible distributed SQL query engine that reads massive datasets in many popular file formats such as Parquet, JSON, and CSV. Drill reads data in HDFS or in cloud-native storage such as S3 and works with Hive metastores along with distributed databases such as HBase, MongoDB, and relational databases. Drill works everywhere: on your laptop or in your largest cluster. In this practical book, Drill committers Charles Givre and Paul Rogers show analysts and data scientists how to query and analyze raw data using this powerful tool. Data scientists today spend about 80% of their time just gathering and cleaning data. With this book, you'll learn how Drill helps you analyze data more effectively to drive down time to insight. Use Drill to clean, prepare, and summarize delimited data for further analysis ; Query file types including logfiles, Parquet, JSON, and other complex formats ; Query Hadoop, relational databases, MongoDB, and Kafka with standard SQL ; Connect to Drill programmatically using a variety of languages ; Use Drill even with challenging or ambiguous file formats ; Perform sophisticated analysis by extending Drill's functionality with user-defined functions ; Facilitate data analysis for network security, image metadata, and machine learning Apache Hadoop Apache Drill Apache Hadoop fast File organization (Computer science) Querying (Computer science) SQL (Computer program language) Big data Big data fast File organization (Computer science) fast Querying (Computer science) fast SQL (Computer program language) fast Rogers, Paul Verfasser aut |
spellingShingle | Givre, Charles Rogers, Paul Learning Apache Drill query and analyze distributed data sources with SQL Apache Hadoop Apache Drill Apache Hadoop fast File organization (Computer science) Querying (Computer science) SQL (Computer program language) Big data Big data fast File organization (Computer science) fast Querying (Computer science) fast SQL (Computer program language) fast |
title | Learning Apache Drill query and analyze distributed data sources with SQL |
title_auth | Learning Apache Drill query and analyze distributed data sources with SQL |
title_exact_search | Learning Apache Drill query and analyze distributed data sources with SQL |
title_full | Learning Apache Drill query and analyze distributed data sources with SQL Charles Givre and Paul Rogers |
title_fullStr | Learning Apache Drill query and analyze distributed data sources with SQL Charles Givre and Paul Rogers |
title_full_unstemmed | Learning Apache Drill query and analyze distributed data sources with SQL Charles Givre and Paul Rogers |
title_short | Learning Apache Drill |
title_sort | learning apache drill query and analyze distributed data sources with sql |
title_sub | query and analyze distributed data sources with SQL |
topic | Apache Hadoop Apache Drill Apache Hadoop fast File organization (Computer science) Querying (Computer science) SQL (Computer program language) Big data Big data fast File organization (Computer science) fast Querying (Computer science) fast SQL (Computer program language) fast |
topic_facet | Apache Hadoop Apache Drill File organization (Computer science) Querying (Computer science) SQL (Computer program language) Big data |
work_keys_str_mv | AT givrecharles learningapachedrillqueryandanalyzedistributeddatasourceswithsql AT rogerspaul learningapachedrillqueryandanalyzedistributeddatasourceswithsql |