Mining of massive datasets
Previous title: | Rajaraman, Anand: Mining of massive datasets |
---|---|
Main authors: | Leskovec, Jure ; Rajaraman, Anand ; Ullman, Jeffrey D. |
Format: | Book |
Language: | English |
Published: | Cambridge : Cambridge Univ. Press, 2014 |
Edition: | 2. ed. |
Subjects: | Big Data ; Data Mining |
Online access: | Table of contents ; Blurb |
Note: | 1st ed. published under the title: Rajaraman, Anand: Mining of massive datasets. This record also covers later unchanged reprints |
Physical description: | XI, 467 pp., ill., diagrams |
ISBN: | 9781107077232 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV042002309 | ||
003 | DE-604 | ||
005 | 20190605 | ||
007 | t | ||
008 | 140730s2014 ad|| |||| 00||| eng d | ||
020 | |a 9781107077232 |c hbk. |9 978-1-107-07723-2 | ||
035 | |a (OCoLC)897210257 | ||
035 | |a (DE-599)BVBBV042002309 | ||
040 | |a DE-604 |b ger |e rakwb | ||
041 | 0 | |a eng | |
049 | |a DE-739 |a DE-29T |a DE-91G |a DE-898 |a DE-11 |a DE-19 |a DE-83 |a DE-573 |a DE-945 |a DE-523 |a DE-384 |a DE-521 | ||
082 | 0 | |a 006.312 |2 23 | |
084 | |a ST 530 |0 (DE-625)143679: |2 rvk | ||
084 | |a DAT 616f |2 stub | ||
084 | |a DAT 450f |2 stub | ||
100 | 1 | |a Leskovec, Jure |e Verfasser |0 (DE-588)1077529708 |4 aut | |
245 | 1 | 0 | |a Mining of massive datasets |c Jure Leskovec ; Anand Rajaraman ; Jeffrey David Ullman |
250 | |a 2. ed. | ||
264 | 1 | |a Cambridge |b Cambridge Univ. Press |c 2014 | |
300 | |a XI, 467 S. |b Ill., graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
500 | |a 1. Aufl.: u.d.T.: Rajaraman, Anand: Mining of massive datasets. - Hier auch später erschienene, unveränderte Nachdrucke | ||
650 | 0 | 7 | |a Big Data |0 (DE-588)4802620-7 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Data Mining |0 (DE-588)4428654-5 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Big Data |0 (DE-588)4802620-7 |D s |
689 | 0 | 1 | |a Data Mining |0 (DE-588)4428654-5 |D s |
689 | 0 | |8 1\p |5 DE-604 | |
700 | 1 | |a Rajaraman, Anand |e Verfasser |0 (DE-588)104476662X |4 aut | |
700 | 1 | |a Ullman, Jeffrey D. |d 1942- |e Verfasser |0 (DE-588)123598230 |4 aut | |
776 | 0 | 8 | |i Erscheint auch als |n Online-Ausgabe |z 978-1-139-92480-1 |
780 | 0 | 0 | |i Fortsetzung von |a Rajaraman, Anand |t Mining of massive datasets |d 2012 |z 978-1-107-01535-7 |w (DE-604)BV039744649 |
856 | 4 | 2 | |m Digitalisierung UB Passau - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027444315&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
856 | 4 | 2 | |m Digitalisierung UB Passau - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027444315&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |3 Klappentext |
999 | |a oai:aleph.bib-bvb.de:BVB01-027444315 | ||
883 | 1 | |8 1\p |a cgwrk |d 20201028 |q DE-101 |u https://d-nb.info/provenance/plan#cgwrk |
Record in the search index
_version_ | 1804152413100179456 |
---|---|
adam_text | The Web, social media, mobile activity, sensors, Internet commerce and so on all provide many extremely large datasets from which information can be gleaned by data mining. This book focuses on practical algorithms that have been used to solve key problems in data mining and can be used on even the largest datasets.
It begins with a discussion of the MapReduce framework, an important tool for parallelizing algorithms automatically. The tricks of locality-sensitive hashing are explained. This body of knowledge, which deserves to be more widely known, is essential when seeking similar objects in a very large collection without having to compare each pair of objects. Stream processing algorithms for mining data that arrives too fast for exhaustive processing are also explained. The PageRank idea and related tricks for organizing the Web are covered next. Other chapters cover the problems of finding frequent itemsets and clustering, each from the point of view that the data is too large to fit in main memory, and two applications: recommendation systems and Web advertising, each vital in e-commerce.
This second edition includes new and extended coverage on social networks, machine learning and dimensionality reduction. Written by leading authorities in database and web technologies, it is essential reading for students and practitioners alike.
Contents
Preface
1 Data Mining
  1.1 What is Data Mining?
  1.2 Statistical Limits on Data Mining
  1.3 Things Useful to Know
  1.4 Outline of the Book
  1.5 Summary of Chapter 1
  1.6 References for Chapter 1
2 MapReduce and the New Software Stack
  2.1 Distributed File Systems
  2.2 MapReduce
  2.3 Algorithms Using MapReduce
  2.4 Extensions to MapReduce
  2.5 The Communication Cost Model
  2.6 Complexity Theory for MapReduce
  2.7 Summary of Chapter 2
  2.8 References for Chapter 2
3 Finding Similar Items
  3.1 Applications of Near-Neighbor Search
  3.2 Shingling of Documents
  3.3 Similarity-Preserving Summaries of Sets
  3.4 Locality-Sensitive Hashing for Documents
  3.5 Distance Measures
  3.6 The Theory of Locality-Sensitive Functions
  3.7 LSH Families for Other Distance Measures
  3.8 Applications of Locality-Sensitive Hashing
  3.9 Methods for High Degrees of Similarity
  3.10 Summary of Chapter 3
  3.11 References for Chapter 3
4 Mining Data Streams
  4.1 The Stream Data Model
  4.2 Sampling Data in a Stream
  4.3 Filtering Streams
  4.4 Counting Distinct Elements in a Stream
  4.5 Estimating Moments
  4.6 Counting Ones in a Window
  4.7 Decaying Windows
  4.8 Summary of Chapter 4
  4.9 References for Chapter 4
5 Link Analysis
  5.1 PageRank
  5.2 Efficient Computation of PageRank
  5.3 Topic-Sensitive PageRank
  5.4 Link Spam
  5.5 Hubs and Authorities
  5.6 Summary of Chapter 5
  5.7 References for Chapter 5
6 Frequent Itemsets
  6.1 The Market-Basket Model
  6.2 Market Baskets and the A-Priori Algorithm
  6.3 Handling Larger Datasets in Main Memory
  6.4 Limited-Pass Algorithms
  6.5 Counting Frequent Items in a Stream
  6.6 Summary of Chapter 6
  6.7 References for Chapter 6
7 Clustering
  7.1 Introduction to Clustering Techniques
  7.2 Hierarchical Clustering
  7.3 K-means Algorithms
  7.4 The CURE Algorithm
  7.5 Clustering in Non-Euclidean Spaces
  7.6 Clustering for Streams and Parallelism
  7.7 Summary of Chapter 7
  7.8 References for Chapter 7
8 Advertising on the Web
  8.1 Issues in On-Line Advertising
  8.2 On-Line Algorithms
  8.3 The Matching Problem
  8.4 The Adwords Problem
  8.5 Adwords Implementation
  8.6 Summary of Chapter 8
  8.7 References for Chapter 8
9 Recommendation Systems
  9.1 A Model for Recommendation Systems
  9.2 Content-Based Recommendations
  9.3 Collaborative Filtering
  9.4 Dimensionality Reduction
  9.5 The NetFlix Challenge
  9.6 Summary of Chapter 9
  9.7 References for Chapter 9
10 Mining Social-Network Graphs
  10.1 Social Networks as Graphs
  10.2 Clustering of Social-Network Graphs
  10.3 Direct Discovery of Communities
  10.4 Partitioning of Graphs
  10.5 Finding Overlapping Communities
  10.6 Simrank
  10.7 Counting Triangles
  10.8 Neighborhood Properties of Graphs
  10.9 Summary of Chapter 10
  10.10 References for Chapter 10
11 Dimensionality Reduction
  11.1 Eigenvalues and Eigenvectors
  11.2 Principal-Component Analysis
  11.3 Singular-Value Decomposition
  11.4 CUR Decomposition
  11.5 Summary of Chapter 11
  11.6 References for Chapter 11
12 Large-Scale Machine Learning
  12.1 The Machine-Learning Model
  12.2 Perceptrons
  12.3 Support-Vector Machines
  12.4 Learning from Nearest Neighbors
  12.5 Comparison of Learning Methods
  12.6 Summary of Chapter 12
  12.7 References for Chapter 12
Index
|
any_adam_object | 1 |
author | Leskovec, Jure Rajaraman, Anand Ullman, Jeffrey D. 1942- |
author_GND | (DE-588)1077529708 (DE-588)104476662X (DE-588)123598230 |
author_facet | Leskovec, Jure Rajaraman, Anand Ullman, Jeffrey D. 1942- |
author_role | aut aut aut |
author_sort | Leskovec, Jure |
author_variant | j l jl a r ar j d u jd jdu |
building | Verbundindex |
bvnumber | BV042002309 |
classification_rvk | ST 530 |
classification_tum | DAT 616f DAT 450f |
ctrlnum | (OCoLC)897210257 (DE-599)BVBBV042002309 |
dewey-full | 006.312 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 006 - Special computer methods |
dewey-raw | 006.312 |
dewey-search | 006.312 |
dewey-sort | 16.312 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik |
edition | 2. ed. |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>02389nam a2200469 c 4500</leader><controlfield tag="001">BV042002309</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20190605 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">140730s2014 ad|| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781107077232</subfield><subfield code="c">hbk.</subfield><subfield code="9">978-1-107-07723-2</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)897210257</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV042002309</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-739</subfield><subfield code="a">DE-29T</subfield><subfield code="a">DE-91G</subfield><subfield code="a">DE-898</subfield><subfield code="a">DE-11</subfield><subfield code="a">DE-19</subfield><subfield code="a">DE-83</subfield><subfield code="a">DE-573</subfield><subfield code="a">DE-945</subfield><subfield code="a">DE-523</subfield><subfield code="a">DE-384</subfield><subfield code="a">DE-521</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">006.312</subfield><subfield code="2">23</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 530</subfield><subfield code="0">(DE-625)143679:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">DAT 616f</subfield><subfield code="2">stub</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">DAT 450f</subfield><subfield 
code="2">stub</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Leskovec, Jure</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1077529708</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Mining of massive datasets</subfield><subfield code="c">Jure Leskovec ; Anand Rajaraman ; Jeffrey David Ullman</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">2. ed.</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Cambridge</subfield><subfield code="b">Cambridge Univ. Press</subfield><subfield code="c">2014</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XI, 467 S.</subfield><subfield code="b">Ill., graph. Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">1. Aufl.: u.d.T.: Rajaraman, Anand: Mining of massive datasets. 
- Hier auch später erschienene, unveränderte Nachdrucke</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Big Data</subfield><subfield code="0">(DE-588)4802620-7</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Data Mining</subfield><subfield code="0">(DE-588)4428654-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Big Data</subfield><subfield code="0">(DE-588)4802620-7</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Data Mining</subfield><subfield code="0">(DE-588)4428654-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="8">1\p</subfield><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Rajaraman, Anand</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)104476662X</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Ullman, Jeffrey D.</subfield><subfield code="d">1942-</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)123598230</subfield><subfield code="4">aut</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Online-Ausgabe</subfield><subfield code="z">978-1-139-92480-1</subfield></datafield><datafield tag="780" ind1="0" ind2="0"><subfield code="i">Fortsetzung von</subfield><subfield code="a">Rajaraman, Anand</subfield><subfield code="t">Mining of massive datasets</subfield><subfield code="d">2012</subfield><subfield code="z">978-1-107-01535-7</subfield><subfield code="w">(DE-604)BV039744649</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield 
code="m">Digitalisierung UB Passau - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027444315&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Passau - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027444315&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Klappentext</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-027444315</subfield></datafield><datafield tag="883" ind1="1" ind2=" "><subfield code="8">1\p</subfield><subfield code="a">cgwrk</subfield><subfield code="d">20201028</subfield><subfield code="q">DE-101</subfield><subfield code="u">https://d-nb.info/provenance/plan#cgwrk</subfield></datafield></record></collection> |
id | DE-604.BV042002309 |
illustrated | Illustrated |
indexdate | 2024-07-10T01:10:19Z |
institution | BVB |
isbn | 9781107077232 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-027444315 |
oclc_num | 897210257 |
open_access_boolean | |
owner | DE-739 DE-29T DE-91G DE-BY-TUM DE-898 DE-BY-UBR DE-11 DE-19 DE-BY-UBM DE-83 DE-573 DE-945 DE-523 DE-384 DE-521 |
owner_facet | DE-739 DE-29T DE-91G DE-BY-TUM DE-898 DE-BY-UBR DE-11 DE-19 DE-BY-UBM DE-83 DE-573 DE-945 DE-523 DE-384 DE-521 |
physical | XI, 467 S. Ill., graph. Darst. |
publishDate | 2014 |
publishDateSearch | 2014 |
publishDateSort | 2014 |
publisher | Cambridge Univ. Press |
record_format | marc |
spelling | Leskovec, Jure Verfasser (DE-588)1077529708 aut Mining of massive datasets Jure Leskovec ; Anand Rajaraman ; Jeffrey David Ullman 2. ed. Cambridge Cambridge Univ. Press 2014 XI, 467 S. Ill., graph. Darst. txt rdacontent n rdamedia nc rdacarrier 1. Aufl.: u.d.T.: Rajaraman, Anand: Mining of massive datasets. - Hier auch später erschienene, unveränderte Nachdrucke Big Data (DE-588)4802620-7 gnd rswk-swf Data Mining (DE-588)4428654-5 gnd rswk-swf Big Data (DE-588)4802620-7 s Data Mining (DE-588)4428654-5 s 1\p DE-604 Rajaraman, Anand Verfasser (DE-588)104476662X aut Ullman, Jeffrey D. 1942- Verfasser (DE-588)123598230 aut Erscheint auch als Online-Ausgabe 978-1-139-92480-1 Fortsetzung von Rajaraman, Anand Mining of massive datasets 2012 978-1-107-01535-7 (DE-604)BV039744649 Digitalisierung UB Passau - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027444315&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis Digitalisierung UB Passau - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027444315&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA Klappentext 1\p cgwrk 20201028 DE-101 https://d-nb.info/provenance/plan#cgwrk |
spellingShingle | Leskovec, Jure Rajaraman, Anand Ullman, Jeffrey D. 1942- Mining of massive datasets Big Data (DE-588)4802620-7 gnd Data Mining (DE-588)4428654-5 gnd |
subject_GND | (DE-588)4802620-7 (DE-588)4428654-5 |
title | Mining of massive datasets |
title_auth | Mining of massive datasets |
title_exact_search | Mining of massive datasets |
title_full | Mining of massive datasets Jure Leskovec ; Anand Rajaraman ; Jeffrey David Ullman |
title_fullStr | Mining of massive datasets Jure Leskovec ; Anand Rajaraman ; Jeffrey David Ullman |
title_full_unstemmed | Mining of massive datasets Jure Leskovec ; Anand Rajaraman ; Jeffrey David Ullman |
title_old | Rajaraman, Anand Mining of massive datasets |
title_short | Mining of massive datasets |
title_sort | mining of massive datasets |
topic | Big Data (DE-588)4802620-7 gnd Data Mining (DE-588)4428654-5 gnd |
topic_facet | Big Data Data Mining |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027444315&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027444315&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT leskovecjure miningofmassivedatasets AT rajaramananand miningofmassivedatasets AT ullmanjeffreyd miningofmassivedatasets |