Data mining for the social sciences: an introduction
Main authors: | Attewell, Paul A.; Monaghan, David B. |
---|---|
Format: | Electronic E-Book |
Language: | English |
Published: | Oakland, California : University of California Press, [2015] |
Edition: | First edition. |
Subjects: | |
Online access: | Full text |
Summary: | "We live, today, in a world of big data. The amount of information collected on human behavior every day is staggering, and exponentially greater than at any time in the past. At the same time, we are inundated by stories of powerful algorithms capable of churning through this sea of data and uncovering patterns. These techniques go by many names - data mining, predictive analytics, machine learning - and they are being used by governments as they spy on citizens and by huge corporations as they fine-tune their advertising strategies. And yet social scientists continue mainly to employ a set of analytical tools developed in an earlier era when data was sparse and difficult to come by. In this timely book, Paul Attewell and David Monaghan provide a simple and accessible introduction to data mining geared towards social scientists. They discuss how the data mining approach differs substantially, and in some ways radically, from that of conventional statistical modeling familiar to most social scientists. They demystify data mining, describing the diverse set of techniques that the term covers and discussing the strengths and weaknesses of the various approaches. Finally, they give practical demonstrations of how to carry out analyses using data mining tools in a number of statistical software packages. It is the hope of the authors that this book will empower social scientists to consider incorporating data mining methodologies in their analytical toolkits"--Provided by publisher |
Description: | 1 online resource (xi, 252 pages) |
Bibliography: | Includes bibliographical references and index. |
ISBN: | 9780520960596 0520960599 |
Internal format
MARC
LEADER | 00000cam a2200000 i 4500 | ||
---|---|---|---|
001 | ZDB-4-EBA-ocn905221641 | ||
003 | OCoLC | ||
005 | 20241004212047.0 | ||
006 | m o d | ||
007 | cr cnu---unuuu | ||
008 | 150319t20152015cau ob 001 0 eng d | ||
040 | |a N$T |b eng |e rda |e pn |c N$T |d JSTOR |d QGK |d EBLCP |d E7B |d YDXCP |d DEBSZ |d K6U |d COCUF |d CNNOR |d OCLCQ |d CCO |d PIFFA |d FVL |d ZCU |d AGLDB |d MERUC |d OCLCQ |d U3W |d D6H |d UUM |d STF |d VNS |d OCLCQ |d VTS |d ICG |d VT2 |d OCLCQ |d WYU |d G3B |d LVT |d TKN |d DKC |d OCLCQ |d DEGRU |d SFB |d OCLCQ |d MM9 |d OCLCQ |d OCLCO |d COM |d LUU |d OCLCQ |d OCLCO |d OCLCL |d UEJ |d OCLCO |d OCLCQ | ||
019 | |a 905988723 |a 959910162 |a 1055316528 |a 1066614218 |a 1081203914 | ||
020 | |a 9780520960596 |q (electronic bk.) | ||
020 | |a 0520960599 |q (electronic bk.) | ||
020 | |z 9780520280977 | ||
020 | |z 0520280970 | ||
020 | |z 9780520280984 | ||
020 | |z 0520280989 | ||
035 | |a (OCoLC)905221641 |z (OCoLC)905988723 |z (OCoLC)959910162 |z (OCoLC)1055316528 |z (OCoLC)1066614218 |z (OCoLC)1081203914 | ||
037 | |a 22573/ctt13h1jg1 |b JSTOR | ||
050 | 4 | |a H61.3 |b .A88 2015eb | |
072 | 7 | |a COM |x 000000 |2 bisacsh | |
072 | 7 | |a SOC006000 |2 bisacsh | |
082 | 7 | |a 006.3/12 |2 23 | |
049 | |a MAIN | ||
100 | 1 | |a Attewell, Paul A., |d 1949- |e author. |1 https://id.oclc.org/worldcat/entity/E39PCjGKyvghHt4xV38QR4pTf3 |0 http://id.loc.gov/authorities/names/n82214112 | |
245 | 1 | 0 | |a Data mining for the social sciences : |b an introduction / |c Paul Attewell and David B. Monaghan, with Darren Kwong. |
250 | |a First edition. | ||
264 | 1 | |a Oakland, California : |b University of California Press, |c [2015] | |
264 | 4 | |c ©2015 | |
300 | |a 1 online resource (xi, 252 pages) | ||
336 | |a text |b txt |2 rdacontent | ||
337 | |a computer |b c |2 rdamedia | ||
338 | |a online resource |b cr |2 rdacarrier | ||
504 | |a Includes bibliographical references and index. | ||
520 | |a "We live, today, in a world of big data. The amount of information collected on human behavior every day is staggering, and exponentially greater than at any time in the past. At the same time, we are inundated by stories of powerful algorithms capable of churning through this sea of data and uncovering patterns. These techniques go by many names - data mining, predictive analytics, machine learning - and they are being used by governments as they spy on citizens and by huge corporations as they fine-tune their advertising strategies. And yet social scientists continue mainly to employ a set of analytical tools developed in an earlier era when data was sparse and difficult to come by. In this timely book, Paul Attewell and David Monaghan provide a simple and accessible introduction to data mining geared towards social scientists. They discuss how the data mining approach differs substantially, and in some ways radically, from that of conventional statistical modeling familiar to most social scientists. They demystify data mining, describing the diverse set of techniques that the term covers and discussing the strengths and weaknesses of the various approaches. Finally, they give practical demonstrations of how to carry out analyses using data mining tools in a number of statistical software packages. It is the hope of the authors that this book will empower social scientists to consider incorporating data mining methodologies in their analytical toolkits"--Provided by publisher | ||
588 | 0 | |a Online resource; title from PDF title page (Ebsco, viewed June 15, 2015). | |
505 | 0 | |a Cover; Title; Copyright; Contents; Acknowledgments; PART 1. CONCEPTS; 1. What Is Data Mining?; The Goals of This Book; Software and Hardware for Data Mining; Basic Terminology; 2. Contrasts with the Conventional Statistical Approach; Predictive Power in Conventional Statistical Modeling; Hypothesis Testing in the Conventional Approach; Heteroscedasticity as a Threat to Validity in Conventional Modeling; The Challenge of Complex and Nonrandom Samples; Bootstrapping and Permutation Tests; Nonlinearity in Conventional Predictive Models; Statistical Interactions in Conventional Models; Conclusion. | |
505 | 8 | |a 3. Some General Strategies Used in Data Mining; Cross-Validation; Overfitting; Boosting; Calibrating; Measuring Fit: The Confusion Matrix and ROC Curves; Identifying Statistical Interactions and Effect Heterogeneity in Data Mining; Bagging and Random Forests; The Limits of Prediction; Big Data Is Never Big Enough; 4. Important Stages in a Data Mining Project; When to Sample Big Data; Building a Rich Array of Features; Feature Selection; Feature Extraction; Constructing a Model; PART 2. WORKED EXAMPLES; 5. Preparing Training and Test Datasets; The Logic of Cross-Validation. | |
505 | 8 | |a Cross-Validation Methods: An Overview; 6. Variable Selection Tools; Stepwise Regression; The LASSO; VIF Regression; 7. Creating New Variables Using Binning and Trees; Discretizing a Continuous Predictor; Continuous Outcomes and Continuous Predictors; Binning Categorical Predictors; Using Partition Trees to Study Interactions; 8. Extracting Variables; Principal Component Analysis; Independent Component Analysis; 9. Classifiers; K-Nearest Neighbors; Naive Bayes; Support Vector Machines; Optimizing Prediction across Multiple Classifiers; 10. Classification Trees; Partition Trees. | |
505 | 8 | |a Boosted Trees and Random Forests; 11. Neural Networks; 12. Clustering; Hierarchical Clustering; K-Means Clustering; Normal Mixtures; Self-Organized Maps; 13. Latent Class Analysis and Mixture Models; Latent Class Analysis; Latent Class Regression; Mixture Models; 14. Association Rules; Conclusion; Bibliography; Notes; Index; A; B; C; D; E; F; G; H; I; J; K; L; M; N; O; P; R; S; T; U; V; W; X; Y; Z. | |
650 | 0 | |a Social sciences |x Data processing. |0 http://id.loc.gov/authorities/subjects/sh85124007 | |
650 | 0 | |a Social sciences |x Statistical methods. |0 http://id.loc.gov/authorities/subjects/sh85124018 | |
650 | 0 | |a Data mining. |0 http://id.loc.gov/authorities/subjects/sh97002073 | |
650 | 2 | |a Data Mining |0 https://id.nlm.nih.gov/mesh/D057225 | |
650 | 6 | |a Sciences sociales |x Informatique. | |
650 | 6 | |a Sciences sociales |x Méthodes statistiques. | |
650 | 6 | |a Exploration de données (Informatique) | |
650 | 7 | |a COMPUTERS |x General. |2 bisacsh | |
650 | 7 | |a SOCIAL SCIENCE |x Demography. |2 bisacsh | |
650 | 7 | |a Data mining |2 fast | |
650 | 7 | |a Social sciences |x Data processing |2 fast | |
650 | 7 | |a Social sciences |x Statistical methods |2 fast | |
700 | 1 | |a Monaghan, David B., |d 1988- |e author. |1 https://id.oclc.org/worldcat/entity/E39PCjGyFXWhq6XPtFP4TxJcvb |0 http://id.loc.gov/authorities/names/no2014117127 | |
700 | 1 | |a Kwong, Darren, |e writer of supplementary textual content. |1 https://id.oclc.org/worldcat/entity/E39PCjrxtFMDKPMPc4WBRwXYbm |0 http://id.loc.gov/authorities/names/no2015071306 | |
776 | 0 | 8 | |i Print version: |a Attewell, Paul A., 1949- |t Data mining for the social sciences. |b First edition |z 9780520280977 |w (DLC) 2014035276 |w (OCoLC)894491465 |
856 | 4 | 0 | |l FWS01 |p ZDB-4-EBA |q FWS_PDA_EBA |u https://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&AN=967323 |3 Volltext |
938 | |a De Gruyter |b DEGR |n 9780520960596 | ||
938 | |a EBL - Ebook Library |b EBLB |n EBL1882080 | ||
938 | |a ebrary |b EBRY |n ebr11033069 | ||
938 | |a EBSCOhost |b EBSC |n 967323 | ||
938 | |a YBP Library Services |b YANK |n 12344891 | ||
994 | |a 92 |b GEBAY | ||
912 | |a ZDB-4-EBA | ||
049 | |a DE-863 |
Record in the search index
DE-BY-FWS_katkey | ZDB-4-EBA-ocn905221641 |
---|---|
_version_ | 1816882306359492608 |
adam_text | |
any_adam_object | |
author | Attewell, Paul A., 1949- Monaghan, David B., 1988- |
author_GND | http://id.loc.gov/authorities/names/n82214112 http://id.loc.gov/authorities/names/no2014117127 http://id.loc.gov/authorities/names/no2015071306 |
author_facet | Attewell, Paul A., 1949- Monaghan, David B., 1988- |
author_role | aut aut |
author_sort | Attewell, Paul A., 1949- |
author_variant | p a a pa paa d b m db dbm |
building | Verbundindex |
bvnumber | localFWS |
callnumber-first | H - Social Science |
callnumber-label | H61 |
callnumber-raw | H61.3 .A88 2015eb |
callnumber-search | H61.3 .A88 2015eb |
callnumber-sort | H 261.3 A88 42015EB |
callnumber-subject | H - Social Science |
collection | ZDB-4-EBA |
contents | Cover; Title; Copyright; Contents; Acknowledgments; PART 1. CONCEPTS; 1. What Is Data Mining?; The Goals of This Book; Software and Hardware for Data Mining; Basic Terminology; 2. Contrasts with the Conventional Statistical Approach; Predictive Power in Conventional Statistical Modeling; Hypothesis Testing in the Conventional Approach; Heteroscedasticity as a Threat to Validity in Conventional Modeling; The Challenge of Complex and Nonrandom Samples; Bootstrapping and Permutation Tests; Nonlinearity in Conventional Predictive Models; Statistical Interactions in Conventional Models; Conclusion. 3. Some General Strategies Used in Data Mining; Cross-Validation; Overfitting; Boosting; Calibrating; Measuring Fit: The Confusion Matrix and ROC Curves; Identifying Statistical Interactions and Effect Heterogeneity in Data Mining; Bagging and Random Forests; The Limits of Prediction; Big Data Is Never Big Enough; 4. Important Stages in a Data Mining Project; When to Sample Big Data; Building a Rich Array of Features; Feature Selection; Feature Extraction; Constructing a Model; PART 2. WORKED EXAMPLES; 5. Preparing Training and Test Datasets; The Logic of Cross-Validation. Cross-Validation Methods: An Overview; 6. Variable Selection Tools; Stepwise Regression; The LASSO; VIF Regression; 7. Creating New Variables Using Binning and Trees; Discretizing a Continuous Predictor; Continuous Outcomes and Continuous Predictors; Binning Categorical Predictors; Using Partition Trees to Study Interactions; 8. Extracting Variables; Principal Component Analysis; Independent Component Analysis; 9. Classifiers; K-Nearest Neighbors; Naive Bayes; Support Vector Machines; Optimizing Prediction across Multiple Classifiers; 10. Classification Trees; Partition Trees. Boosted Trees and Random Forests; 11. Neural Networks; 12. Clustering; Hierarchical Clustering; K-Means Clustering; Normal Mixtures; Self-Organized Maps; 13. Latent Class Analysis and Mixture Models; Latent Class Analysis; Latent Class Regression; Mixture Models; 14. Association Rules; Conclusion; Bibliography; Notes; Index; A; B; C; D; E; F; G; H; I; J; K; L; M; N; O; P; R; S; T; U; V; W; X; Y; Z. |
ctrlnum | (OCoLC)905221641 |
dewey-full | 006.3/12 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 006 - Special computer methods |
dewey-raw | 006.3/12 |
dewey-search | 006.3/12 |
dewey-sort | 16.3 212 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik |
edition | First edition. |
format | Electronic eBook |
id | ZDB-4-EBA-ocn905221641 |
illustrated | Not Illustrated |
indexdate | 2024-11-27T13:26:31Z |
institution | BVB |
isbn | 9780520960596 0520960599 |
language | English |
oclc_num | 905221641 |
open_access_boolean | |
owner | MAIN DE-863 DE-BY-FWS |
owner_facet | MAIN DE-863 DE-BY-FWS |
physical | 1 online resource (xi, 252 pages) |
psigel | ZDB-4-EBA |
publishDate | 2015 |
publishDateSearch | 2015 |
publishDateSort | 2015 |
publisher | University of California Press, |
record_format | marc |
spelling | Attewell, Paul A., 1949- author. https://id.oclc.org/worldcat/entity/E39PCjGKyvghHt4xV38QR4pTf3 http://id.loc.gov/authorities/names/n82214112 Data mining for the social sciences : an introduction / Paul Attewell and David B. Monaghan, with Darren Kwong. First edition. Oakland, California : University of California Press, [2015] ©2015 1 online resource (xi, 252 pages) text txt rdacontent computer c rdamedia online resource cr rdacarrier Includes bibliographical references and index. "We live, today, in world of big data. The amount of information collected on human behavior every day is staggering, and exponentially greater than at any time in the past. At the same time, we are inundated by stories of powerful algorithms capable of churning through this sea of data and uncovering patterns. These techniques go by many names - data mining, predictive analytics, machine learning - and they are being used by governments as they spy on citizens and by huge corporations are they fine-tune their advertising strategies. And yet social scientists continue mainly to employ a set of analytical tools developed in an earlier era when data was sparse and difficult to come by. In this timely book, Paul Attewell and David Monaghan provide a simple and accessible introduction to Data Mining geared towards social scientists. They discuss how the data mining approach differs substantially, and in some ways radically, from that of conventional statistical modeling familiar to most social scientists. They demystify data mining, describing the diverse set of techniques that the term covers and discussing the strengths and weaknesses of the various approaches. Finally they give practical demonstrations of how to carry out analyses using data mining tools in a number of statistical software packages. 
It is the hope of the authors that this book will empower social scientists to consider incorporating data mining methodologies in their analytical toolkits"--Provided by publisher Online resource; title from PDF title page (Ebsco, viewed June 15, 2015). Cover; Title; Copyright; Contents; Acknowledgments; PART 1. CONCEPTS; 1. What Is Data Mining?; The Goals of This Book; Software and Hardware for Data Mining; Basic Terminology; 2. Contrasts with the Conventional Statistical Approach; Predictive Power in Conventional Statistical Modeling; Hypothesis Testing in the Conventional Approach; Heteroscedasticity as a Threat to Validity in Conventional Modeling; The Challenge of Complex and Nonrandom Samples; Bootstrapping and Permutation Tests; Nonlinearity in Conventional Predictive Models; Statistical Interactions in Conventional Models; Conclusion. 3. Some General Strategies Used in Data MiningCross-Validation; Overfitting; Boosting; Calibrating; Measuring Fit: The Confusion Matrix and ROC Curves; Identifying Statistical Interactions and Effect Heterogeneity in Data Mining; Bagging and Random Forests; The Limits of Prediction; Big Data Is Never Big Enough; 4. Important Stages in a Data Mining Project; When to Sample Big Data; Building a Rich Array of Features; Feature Selection; Feature Extraction; Constructing a Model; PART 2. WORKED EXAMPLES; 5. Preparing Training and Test Datasets ; The Logic of Cross-Validation. Cross-Validation Methods: An Overview6. Variable Selection Tools; Stepwise Regression; The LASSO; VIF Regression; 7. Creating New Variables Using Binning and Trees; Discretizing a Continuous Predictor; Continuous Outcomes and Continuous Predictors; Binning Categorical Predictors; Using Partition Trees to Study Interactions; 8. Extracting Variables; Principal Component Analysis; Independent Component Analysis; 9. Classifiers; K-Nearest Neighbors; Naive Bayes; Support Vector Machines; Optimizing Prediction across Multiple Classifiers; 10. 
Classification Trees; Partition Trees; Boosted Trees and Random Forests; 11. Neural Networks; 12. Clustering; Hierarchical Clustering; K-Means Clustering; Normal Mixtures; Self-Organized Maps; 13. Latent Class Analysis and Mixture Models; Latent Class Analysis; Latent Class Regression; Mixture Models; 14. Association Rules; Conclusion; Bibliography; Notes; Index; A; B; C; D; E; F; G; H; I; J; K; L; M; N; O; P; R; S; T; U; V; W; X; Y; Z. Social sciences Data processing. http://id.loc.gov/authorities/subjects/sh85124007 Social sciences Statistical methods. http://id.loc.gov/authorities/subjects/sh85124018 Data mining. http://id.loc.gov/authorities/subjects/sh97002073 Data Mining https://id.nlm.nih.gov/mesh/D057225 Sciences sociales Informatique. Sciences sociales Méthodes statistiques. Exploration de données (Informatique) COMPUTERS General. bisacsh SOCIAL SCIENCE Demography. bisacsh Data mining fast Social sciences Data processing fast Social sciences Statistical methods fast Monaghan, David B., 1988- author. https://id.oclc.org/worldcat/entity/E39PCjGyFXWhq6XPtFP4TxJcvb http://id.loc.gov/authorities/names/no2014117127 Kwong, Darren, writer of supplementary textual content. https://id.oclc.org/worldcat/entity/E39PCjrxtFMDKPMPc4WBRwXYbm http://id.loc.gov/authorities/names/no2015071306 Print version: Attewell, Paul A., 1949- Data mining for the social sciences. First edition 9780520280977 (DLC) 2014035276 (OCoLC)894491465 FWS01 ZDB-4-EBA FWS_PDA_EBA https://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&AN=967323 Volltext |
spellingShingle | Attewell, Paul A., 1949- Monaghan, David B., 1988- Data mining for the social sciences : an introduction / Cover; Title; Copyright; Contents; Acknowledgments; PART 1. CONCEPTS; 1. What Is Data Mining?; The Goals of This Book; Software and Hardware for Data Mining; Basic Terminology; 2. Contrasts with the Conventional Statistical Approach; Predictive Power in Conventional Statistical Modeling; Hypothesis Testing in the Conventional Approach; Heteroscedasticity as a Threat to Validity in Conventional Modeling; The Challenge of Complex and Nonrandom Samples; Bootstrapping and Permutation Tests; Nonlinearity in Conventional Predictive Models; Statistical Interactions in Conventional Models; Conclusion; 3. Some General Strategies Used in Data Mining; Cross-Validation; Overfitting; Boosting; Calibrating; Measuring Fit: The Confusion Matrix and ROC Curves; Identifying Statistical Interactions and Effect Heterogeneity in Data Mining; Bagging and Random Forests; The Limits of Prediction; Big Data Is Never Big Enough; 4. Important Stages in a Data Mining Project; When to Sample Big Data; Building a Rich Array of Features; Feature Selection; Feature Extraction; Constructing a Model; PART 2. WORKED EXAMPLES; 5. Preparing Training and Test Datasets; The Logic of Cross-Validation; Cross-Validation Methods: An Overview; 6. Variable Selection Tools; Stepwise Regression; The LASSO; VIF Regression; 7. Creating New Variables Using Binning and Trees; Discretizing a Continuous Predictor; Continuous Outcomes and Continuous Predictors; Binning Categorical Predictors; Using Partition Trees to Study Interactions; 8. Extracting Variables; Principal Component Analysis; Independent Component Analysis; 9. Classifiers; K-Nearest Neighbors; Naive Bayes; Support Vector Machines; Optimizing Prediction across Multiple Classifiers; 10. Classification Trees; Partition Trees; Boosted Trees and Random Forests; 11. Neural Networks; 12.
Clustering; Hierarchical Clustering; K-Means Clustering; Normal Mixtures; Self-Organized Maps; 13. Latent Class Analysis and Mixture Models; Latent Class Analysis; Latent Class Regression; Mixture Models; 14. Association Rules; Conclusion; Bibliography; Notes; Index; A; B; C; D; E; F; G; H; I; J; K; L; M; N; O; P; R; S; T; U; V; W; X; Y; Z. Social sciences Data processing. http://id.loc.gov/authorities/subjects/sh85124007 Social sciences Statistical methods. http://id.loc.gov/authorities/subjects/sh85124018 Data mining. http://id.loc.gov/authorities/subjects/sh97002073 Data Mining https://id.nlm.nih.gov/mesh/D057225 Sciences sociales Informatique. Sciences sociales Méthodes statistiques. Exploration de données (Informatique) COMPUTERS General. bisacsh SOCIAL SCIENCE Demography. bisacsh Data mining fast Social sciences Data processing fast Social sciences Statistical methods fast |
subject_GND | http://id.loc.gov/authorities/subjects/sh85124007 http://id.loc.gov/authorities/subjects/sh85124018 http://id.loc.gov/authorities/subjects/sh97002073 https://id.nlm.nih.gov/mesh/D057225 |
title | Data mining for the social sciences : an introduction / |
title_auth | Data mining for the social sciences : an introduction / |
title_exact_search | Data mining for the social sciences : an introduction / |
title_full | Data mining for the social sciences : an introduction / Paul Attewell and David B. Monaghan, with Darren Kwong. |
title_fullStr | Data mining for the social sciences : an introduction / Paul Attewell and David B. Monaghan, with Darren Kwong. |
title_full_unstemmed | Data mining for the social sciences : an introduction / Paul Attewell and David B. Monaghan, with Darren Kwong. |
title_short | Data mining for the social sciences : |
title_sort | data mining for the social sciences an introduction |
title_sub | an introduction / |
topic | Social sciences Data processing. http://id.loc.gov/authorities/subjects/sh85124007 Social sciences Statistical methods. http://id.loc.gov/authorities/subjects/sh85124018 Data mining. http://id.loc.gov/authorities/subjects/sh97002073 Data Mining https://id.nlm.nih.gov/mesh/D057225 Sciences sociales Informatique. Sciences sociales Méthodes statistiques. Exploration de données (Informatique) COMPUTERS General. bisacsh SOCIAL SCIENCE Demography. bisacsh Data mining fast Social sciences Data processing fast Social sciences Statistical methods fast |
topic_facet | Social sciences Data processing. Social sciences Statistical methods. Data mining. Data Mining Sciences sociales Informatique. Sciences sociales Méthodes statistiques. Exploration de données (Informatique) COMPUTERS General. SOCIAL SCIENCE Demography. Data mining Social sciences Data processing Social sciences Statistical methods |
url | https://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&AN=967323 |
work_keys_str_mv | AT attewellpaula dataminingforthesocialsciencesanintroduction AT monaghandavidb dataminingforthesocialsciencesanintroduction AT kwongdarren dataminingforthesocialsciencesanintroduction |