Principles of big data: preparing, sharing, and analyzing complex information
Main author: | Berman, Jules J., 1950- |
---|---|
Format: | Book |
Language: | English |
Published: | Amsterdam [et al.]: Morgan Kaufmann, 2013 |
Subjects: | Datenaufbereitung (data preparation); Datenanalyse (data analysis); Datenmanagement (data management); Big Data |
Online access: | Table of contents |
Description: | XXVI, 261 pp., ill., graphs |
ISBN: | 9780124045767 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV041086271 | ||
003 | DE-604 | ||
005 | 20180921 | ||
007 | t | ||
008 | 130613s2013 ad|| |||| 00||| eng d | ||
010 | |a 2013006421 | ||
020 | |a 9780124045767 |9 978-0-12-404576-7 | ||
035 | |a (OCoLC)856804539 | ||
035 | |a (DE-599)GBV744167833 | ||
040 | |a DE-604 |b ger |e aacr | ||
041 | 0 | |a eng | |
049 | |a DE-473 |a DE-91G |a DE-83 |a DE-573 |a DE-739 |a DE-11 |a DE-M382 |a DE-92 | ||
084 | |a ST 265 |0 (DE-625)143634: |2 rvk | ||
084 | |a ST 530 |0 (DE-625)143679: |2 rvk | ||
084 | |a DAT 620f |2 stub | ||
100 | 1 | |a Berman, Jules J. |d 1950- |e Verfasser |0 (DE-588)1043067620 |4 aut | |
245 | 1 | 0 | |a Principles of big data |b preparing, sharing, and analyzing complex information |c Jules J. Berman |
264 | 1 | |a Amsterdam [u.a.] |b Morgan Kaufmann |c 2013 | |
300 | |a XXVI, 261 S. |b Ill., graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
650 | 0 | 7 | |a Datenaufbereitung |0 (DE-588)4148865-9 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Datenanalyse |0 (DE-588)4123037-1 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Datenmanagement |0 (DE-588)4213132-7 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Big Data |0 (DE-588)4802620-7 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Big Data |0 (DE-588)4802620-7 |D s |
689 | 0 | 1 | |a Datenmanagement |0 (DE-588)4213132-7 |D s |
689 | 0 | 2 | |a Datenanalyse |0 (DE-588)4123037-1 |D s |
689 | 0 | 3 | |a Datenaufbereitung |0 (DE-588)4148865-9 |D s |
689 | 0 | |5 DE-604 | |
856 | 4 | 2 | |m Digitalisierung UB Bamberg |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026062959&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-026062959 |
Record in the search index
_version_ | 1804150460198682624 |
---|---|
adam_text | Contents
Acknowledgments xi
Author Biography xiii
Preface xv
Introduction xix
1. Providing Structure to Unstructured Data
Background 1
Machine Translation 2
Autocoding 4
Indexing 9
Term Extraction 11
2. Identification, Deidentification, and Reidentification
Background 15
Features of an Identifier System 17
Registered Unique Object Identifiers 18
Really Bad Identifier Methods 22
Embedding Information in an Identifier: Not Recommended 24
One-Way Hashes 25
Use Case: Hospital Registration 26
Deidentification 28
Data Scrubbing 30
Reidentification 31
Lessons Learned 32
3. Ontologies and Semantics
Background 35
Classifications, the Simplest of Ontologies 36
Ontologies, Classes with Multiple Parents 39
Choosing a Class Model 40
Introduction to Resource Description Framework Schema 44
Common Pitfalls in Ontology Development 46
4. Introspection
Background 49
Knowledge of Self 50
eXtensible Markup Language 52
Introduction to Meaning 54
Namespaces and the Aggregation of Meaningful Assertions 55
Resource Description Framework Triples 56
Reflection 59
Use Case: Trusted Time Stamp 59
Summary 60
5. Data Integration and Software Interoperability
Background 63
The Committee to Survey Standards 64
Standard Trajectory 65
Specifications and Standards 69
Versioning 71
Compliance Issues 73
Interfaces to Big Data Resources 74
6. Immutability and Immortality
Background 77
Immutability and Identifiers 78
Data Objects 80
Legacy Data 82
Data Born from Data 83
Reconciling Identifiers across Institutions 84
Zero-Knowledge Reconciliation 86
The Curator's Burden 87
7. Measurement
Background 89
Counting 90
Gene Counting 93
Dealing with Negations 93
Understanding Your Control 95
Practical Significance of Measurements 96
Obsessive-Compulsive Disorder: The Mark of a Great Data Manager 97
8. Simple but Powerful Big Data Techniques
Background 99
Look at the Data 100
Data Range 110
Denominator 112
Frequency Distributions 115
Mean and Standard Deviation 119
Estimation-Only Analyses 122
Use Case: Watching Data Trends with Google Ngrams 123
Use Case: Estimating Movie Preferences 126
9. Analysis
Background 129
Analytic Tasks 130
Clustering, Classifying, Recommending, and Modeling 130
Data Reduction 134
Normalizing and Adjusting Data 137
Big Data Software: Speed and Scalability 139
Find Relationships, Not Similarities 141
10. Special Considerations in Big Data Analysis
Background 145
Theory in Search of Data 146
Data in Search of a Theory 146
Overfitting 148
Bigness Bias 148
Too Much Data 151
Fixing Data 152
Data Subsets in Big Data: Neither Additive nor Transitive 153
Additional Big Data Pitfalls 154
11. Stepwise Approach to Big Data Analysis
Background 157
Step 1. A Question Is Formulated 158
Step 2. Resource Evaluation 158
Step 3. A Question Is Reformulated 159
Step 4. Query Output Adequacy 160
Step 5. Data Description 161
Step 6. Data Reduction 161
Step 7. Algorithms Are Selected, If Absolutely Necessary 162
Step 8. Results Are Reviewed and Conclusions Are Asserted 164
Step 9. Conclusions Are Examined and Subjected to Validation 164
12. Failure
Background 167
Failure Is Common 168
Failed Standards 169
Complexity 172
When Does Complexity Help? 173
When Redundancy Fails 174
Save Money; Don't Protect Harmless Information 176
After Failure 177
Use Case: Cancer Biomedical Informatics Grid, a Bridge Too Far 178
13. Legalities
Background 183
Responsibility for the Accuracy and Legitimacy of Contained Data 184
Rights to Create, Use, and Share the Resource 185
Copyright and Patent Infringements Incurred by Using Standards 187
Protections for Individuals 188
Consent 190
Unconsented Data 194
Good Policies Are a Good Policy 197
Use Case: The Havasupai Story 198
14. Societal Issues
Background 201
How Big Data Is Perceived 201
The Necessity of Data Sharing, Even When It Seems Irrelevant 204
Reducing Costs and Increasing Productivity with Big Data 208
Public Mistrust 210
Saving Us from Ourselves 211
Hubris and Hyperbole 213
15. The Future
Background 217
Last Words 226
Glossary 229
References 247
Index 257
|
any_adam_object | 1 |
author | Berman, Jules J. 1950- |
author_GND | (DE-588)1043067620 |
author_facet | Berman, Jules J. 1950- |
author_role | aut |
author_sort | Berman, Jules J. 1950- |
author_variant | j j b jj jjb |
building | Verbundindex |
bvnumber | BV041086271 |
classification_rvk | ST 265 ST 530 |
classification_tum | DAT 620f |
ctrlnum | (OCoLC)856804539 (DE-599)GBV744167833 |
discipline | Informatik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01740nam a2200421 c 4500</leader><controlfield tag="001">BV041086271</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20180921 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">130613s2013 ad|| |||| 00||| eng d</controlfield><datafield tag="010" ind1=" " ind2=" "><subfield code="a">2013006421</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9780124045767</subfield><subfield code="9">978-0-12-404576-7</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)856804539</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)GBV744167833</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">aacr</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-473</subfield><subfield code="a">DE-91G</subfield><subfield code="a">DE-83</subfield><subfield code="a">DE-573</subfield><subfield code="a">DE-739</subfield><subfield code="a">DE-11</subfield><subfield code="a">DE-M382</subfield><subfield code="a">DE-92</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 265</subfield><subfield code="0">(DE-625)143634:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 530</subfield><subfield code="0">(DE-625)143679:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">DAT 620f</subfield><subfield code="2">stub</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Berman, Jules J.</subfield><subfield code="d">1950-</subfield><subfield 
code="e">Verfasser</subfield><subfield code="0">(DE-588)1043067620</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Principles of big data</subfield><subfield code="b">preparing, sharing, and analyzing complex information</subfield><subfield code="c">Jules J. Berman</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Amsterdam [u.a.]</subfield><subfield code="b">Morgan Kaufmann</subfield><subfield code="c">2013</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XXVI, 261 S.</subfield><subfield code="b">Ill., graph. Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Datenaufbereitung</subfield><subfield code="0">(DE-588)4148865-9</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Datenanalyse</subfield><subfield code="0">(DE-588)4123037-1</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Datenmanagement</subfield><subfield code="0">(DE-588)4213132-7</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Big Data</subfield><subfield code="0">(DE-588)4802620-7</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Big Data</subfield><subfield 
code="0">(DE-588)4802620-7</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Datenmanagement</subfield><subfield code="0">(DE-588)4213132-7</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Datenanalyse</subfield><subfield code="0">(DE-588)4123037-1</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="3"><subfield code="a">Datenaufbereitung</subfield><subfield code="0">(DE-588)4148865-9</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Bamberg</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026062959&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-026062959</subfield></datafield></record></collection> |
id | DE-604.BV041086271 |
illustrated | Illustrated |
indexdate | 2024-07-10T00:39:16Z |
institution | BVB |
isbn | 9780124045767 |
language | English |
lccn | 2013006421 |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-026062959 |
oclc_num | 856804539 |
open_access_boolean | |
owner | DE-473 DE-BY-UBG DE-91G DE-BY-TUM DE-83 DE-573 DE-739 DE-11 DE-M382 DE-92 |
owner_facet | DE-473 DE-BY-UBG DE-91G DE-BY-TUM DE-83 DE-573 DE-739 DE-11 DE-M382 DE-92 |
physical | XXVI, 261 S. Ill., graph. Darst. |
publishDate | 2013 |
publishDateSearch | 2013 |
publishDateSort | 2013 |
publisher | Morgan Kaufmann |
record_format | marc |
spelling | Berman, Jules J. 1950- Verfasser (DE-588)1043067620 aut Principles of big data preparing, sharing, and analyzing complex information Jules J. Berman Amsterdam [u.a.] Morgan Kaufmann 2013 XXVI, 261 S. Ill., graph. Darst. txt rdacontent n rdamedia nc rdacarrier Datenaufbereitung (DE-588)4148865-9 gnd rswk-swf Datenanalyse (DE-588)4123037-1 gnd rswk-swf Datenmanagement (DE-588)4213132-7 gnd rswk-swf Big Data (DE-588)4802620-7 gnd rswk-swf Big Data (DE-588)4802620-7 s Datenmanagement (DE-588)4213132-7 s Datenanalyse (DE-588)4123037-1 s Datenaufbereitung (DE-588)4148865-9 s DE-604 Digitalisierung UB Bamberg application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026062959&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Berman, Jules J. 1950- Principles of big data preparing, sharing, and analyzing complex information Datenaufbereitung (DE-588)4148865-9 gnd Datenanalyse (DE-588)4123037-1 gnd Datenmanagement (DE-588)4213132-7 gnd Big Data (DE-588)4802620-7 gnd |
subject_GND | (DE-588)4148865-9 (DE-588)4123037-1 (DE-588)4213132-7 (DE-588)4802620-7 |
title | Principles of big data preparing, sharing, and analyzing complex information |
title_auth | Principles of big data preparing, sharing, and analyzing complex information |
title_exact_search | Principles of big data preparing, sharing, and analyzing complex information |
title_full | Principles of big data preparing, sharing, and analyzing complex information Jules J. Berman |
title_fullStr | Principles of big data preparing, sharing, and analyzing complex information Jules J. Berman |
title_full_unstemmed | Principles of big data preparing, sharing, and analyzing complex information Jules J. Berman |
title_short | Principles of big data |
title_sort | principles of big data preparing sharing and analyzing complex information |
title_sub | preparing, sharing, and analyzing complex information |
topic | Datenaufbereitung (DE-588)4148865-9 gnd Datenanalyse (DE-588)4123037-1 gnd Datenmanagement (DE-588)4213132-7 gnd Big Data (DE-588)4802620-7 gnd |
topic_facet | Datenaufbereitung Datenanalyse Datenmanagement Big Data |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026062959&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT bermanjulesj principlesofbigdatapreparingsharingandanalyzingcomplexinformation |