Machine learning with R
Main author: | Ghatak, Abhijit |
---|---|
Format: | Book |
Language: | English |
Published: | Singapore Springer [2017] |
Subjects: | R (Programm); Maschinelles Lernen |
Online access: | Table of contents |
Description: | xix, 210 pages, diagrams |
ISBN: | 9789811068072 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV045236870 | ||
003 | DE-604 | ||
005 | 20221019 | ||
007 | t | ||
008 | 181017s2017 |||| |||| 00||| eng d | ||
020 | |a 9789811068072 |9 978-981-10-6807-2 | ||
035 | |a (OCoLC)1048370485 | ||
035 | |a (DE-599)BVBBV045236870 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-355 |a DE-945 |a DE-19 |a DE-11 | ||
082 | 0 | |a 006.3 |2 23 | |
084 | |a ST 300 |0 (DE-625)143650: |2 rvk | ||
084 | |a ST 250 |0 (DE-625)143626: |2 rvk | ||
084 | |a ST 250 R01 |2 sdnb | ||
100 | 1 | |a Ghatak, Abhijit |e Verfasser |0 (DE-588)1173132031 |4 aut | |
245 | 1 | 0 | |a Machine learning with R |c Abhijit Ghatak |
264 | 1 | |a Singapore |b Springer |c [2017] | |
264 | 4 | |c © 2017 | |
300 | |a xix, 210 Seiten |b Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
650 | 0 | 7 | |a R |g Programm |0 (DE-588)4705956-4 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |D s |
689 | 0 | 1 | |a R |g Programm |0 (DE-588)4705956-4 |D s |
689 | 0 | |5 DE-604 | |
776 | 0 | 8 | |i Erscheint auch als |n Online-Ausgabe |z 978-981-10-6808-9 |
856 | 4 | 2 | |m Digitalisierung UB Regensburg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=030625140&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-030625140 |
Record in the search index
_version_ | 1804178968393285632 |
---|---|
adam_text | Contents
Preface............................................................... vii
1 Linear Algebra, Numerical Optimization, and Its Applications
in Machine Learning ............................................. 1
1.1 Scalars, Vectors, and Linear Functions.......................... 1
1.1.1 Scalars................................................. 1
1.1.2 Vectors................................................. 1
1.2 Linear Functions................................................ 4
1.3 Matrices........................................................ 4
1.3.1 Transpose of a Matrix................................... 4
1.3.2 Identity Matrix......................................... 4
1.3.3 Inverse of a Matrix..................................... 5
1.3.4 Representing Linear Equations in Matrix Form............ 5
1.4 Matrix Transformations.......................................... 6
1.5 Norms........................................................... 7
1.5.1 ℓ2 Optimization......................................... 8
1.5.2 ℓ1 Optimization......................................... 9
1.6 Rewriting the Regression Model in Matrix Notation............... 9
1.7 Cost of a n-Dimensional Function............................. 10
1.8 Computing the Gradient of the Cost............................. 11
1.8.1 Closed-Form Solution................................... 11
1.8.2 Gradient Descent....................................... 12
1.9 An Example of Gradient Descent Optimization.................... 13
1.10 Eigendecomposition............................................. 14
1.11 Singular Value Decomposition (SVD)............................. 18
1.12 Principal Component Analysis (PCA)............................. 21
1.12.1 PCA and SVD............................................ 22
1.13 Computational Errors........................................... 27
1.13.1 Rounding—Overflow and Underflow........................ 28
1.13.2 Conditioning........................................... 28
1.14 Numerical Optimization......................................... 29
2 Probability and Distributions
2.1 Sources of Uncertainty
2.2 Random Experiment
2.3 Probability
2.3.1 Marginal Probability..................................... 33
2.3.2 Conditional Probability.................................. 34
2.3.3 The Chain Rule........................................... 34
2.4 Bayes’ Rule...................................................... 35
2.5 Probability Distribution......................................... 37
2.5.1 Discrete Probability Distribution........................ 37
2.5.2 Continuous Probability Distribution...................... 37
2.5.3 Cumulative Probability Distribution...................... 37
2.5.4 Joint Probability Distribution........................... 38
2.6 Measures of Central Tendency..................................... 38
2.7 Dispersion.......................................................... 39
2.8 Covariance and Correlation.......................................... 39
2.9 Shape of a Distribution.......................................... 41
2.10 Chebyshev’s Inequality........................................... 41
2.11 Common Probability Distributions.................................... 42
2.11.1 Discrete Distributions...................................... 42
2.11.2 Continuous Distributions.................................. 43
2.11.3 Summary of Probability Distributions........................ 45
2.12 Tests for Fit....................................................... 46
2.12.1 Chi-Square Distribution..................................... 47
2.12.2 Chi-Square Test............................................. 48
2.13 Ratio Distributions................................................. 50
2.13.1 Student’s t-Distribution.................................... 51
2.13.2 F-Distribution ............................................. 54
3 Introduction to Machine Learning........................................ 57
3.1 Scientific Enquiry................................................ 58
3.1.1 Empirical Science........................................... 58
3.1.2 Theoretical Science......................................... 59
3.1.3 Computational Science....................................... 59
3.1.4 e-Science................................................... 59
3.2 Machine Learning.................................................... 59
3.2.1 A Learning Task............................................. 60
3.2.2 The Performance Measure................................... 60
3.2.3 The Experience.............................................. 61
3.3 Train and Test Data................................................. 61
3.3.1 Training Error, Generalization (True) Error,
and Test Error............................................ 61
3.4 Irreducible Error, Bias, and Variance........................... 64
3.5 Bias-Variance Trade-off......................................... 66
3.6 Deriving the Expected Prediction Error.......................... 67
3.7 Underfitting and Overfitting.................................... 68
3.8 Regularization.................................................. 69
3.9 Hyperparameters................................................. 71
3.10 Cross-Validation................................................ 72
3.11 Maximum Likelihood Estimation................................... 72
3.12 Gradient Descent................................................ 75
3.13 Building a Machine Learning Algorithm........................... 76
3.13.1 Challenges in Learning Algorithms....................... 77
3.13.2 Curse of Dimensionality and Feature Engineering...... 77
3.14 Conclusion...................................................... 78
4 Regression.......................................................... 79
4.1 Linear Regression............................................... 79
4.1.1 Hypothesis Function..................................... 79
4.1.2 Cost Function........................................... 80
4.2 Linear Regression as Ordinary Least Squares..................... 81
4.3 Linear Regression as Maximum Likelihood......................... 83
4.4 Gradient Descent................................................ 84
4.4.1 Gradient of RSS......................................... 84
4.4.2 Closed Form Solution.................................... 84
4.4.3 Step-by-Step Batch Gradient Descent..................... 84
4.4.4 Writing the Batch Gradient Descent Application....... 85
4.4.5 Writing the Stochastic Gradient
Descent Application..................................... 89
4.5 Linear Regression Assumptions................................... 90
4.6 Summary of Regression Outputs................................... 93
4.7 Ridge Regression................................................ 95
4.7.1 Computing the Gradient of Ridge Regression.............. 97
4.7.2 Writing the Ridge Regression Gradient Descent
Application............................................. 99
4.8 Assessing Performance.......................................... 103
4.8.1 Sources of Error Revisited............................. 104
4.8.2 Bias-Variance Trade-Off in Ridge Regression............ 106
4.9 Lasso Regression............................................... 107
4.9.1 Coordinate Descent for Least Squares Regression...... 108
4.9.2 Coordinate Descent for Lasso........................... 109
4.9.3 Writing the Lasso Coordinate Descent Application..... 110
4.9.4 Implementing Coordinate Descent........................ 112
4.9.5 Bias Variance Trade-Off in Lasso Regression............ 113
5 Classification................................................... 115
5.1 Linear Classifiers............................................. 115
5.1.1 Linear Classifier Model................................ 116
5.1.2 Interpreting the Score................................. 117
5.2 Logistic Regression............................................ 117
5.2.1 Likelihood Function.................................... 120
5.2.2 Model Selection with Log-Likelihood.................... 120
5.2.3 Gradient Ascent to Find the Best Linear Classifier..... 121
5.2.4 Deriving the Log-Likelihood Function................... 122
5.2.5 Deriving the Gradient of Log-Likelihood................ 124
5.2.6 Gradient Ascent for Logistic Regression................ 125
5.2.7 Writing the Logistic Regression Application............ 125
5.2.8 A Comparison Using the BFGS Optimization Method........ 129
5.2.9 Regularization......................................... 131
5.2.10 ℓ2 Regularized Logistic Regression.................... 131
5.2.11 ℓ1 Regularized Logistic Regression with Gradient Ascent... 133
5.2.12 Writing the Ridge Logistic Regression with Gradient Ascent Application... 133
5.2.13 Writing the Lasso Regularized Logistic Regression With Gradient Ascent Application... 138
5.3 Decision Trees................................................. 143
5.3.1 Decision Tree Algorithm................................ 145
5.3.2 Overfitting in Decision Trees.......................... 145
5.3.3 Control of Tree Parameters............................. 146
5.3.4 Writing the Decision Tree Application.................. 147
5.3.5 Unbalanced Data........................................ 152
5.4 Assessing Performance.......................................... 153
5.4.1 Assessing Performance-Logistic Regression.............. 155
5.5 Boosting....................................................... 158
5.5.1 AdaBoost Learning Ensemble............................. 160
5.5.2 AdaBoost: Learning from Weighted Data.................. 160
5.5.3 AdaBoost: Updating the Weights......................... 161
5.5.4 AdaBoost Algorithm..................................... 162
5.5.5 Writing the Weighted Decision Tree Algorithm........... 162
5.5.6 Writing the AdaBoost Application....................... 168
5.5.7 Performance of our AdaBoost Algorithm.................. 172
5.6 Other Variants................................................. 175
5.6.1 Bagging................................................ 175
5.6.2 Gradient Boosting...................................... 176
5.6.3 XGBoost................................................ 176
6 Clustering............................................................. 179
6.1 The Clustering Algorithm....................................... 180
6.2 Clustering Algorithm as Coordinate Descent optimization....... 180
6.3 An Introduction to Text mining................................ 181
6.3.1 Text Mining Application—Reading Multiple Text
Files from Multiple Directories........................ 181
6.3.2 Text Mining Application—Creating a Weighted tf-idf
Document-Term Matrix................................... 182
6.3.3 Text Mining Application—Exploratory Analysis........... 183
6.4 Writing the Clustering Application............................. 183
6.4.1 Smart Initialization of k-means........................ 193
6.4.2 Writing the k-means++ Application..................... 193
6.4.3 Finding the Optimal Number of Centroids............... 199
6.5 Topic Modeling................................................. 201
6.5.1 Clustering and Topic Modeling.......................... 201
6.5.2 Latent Dirichlet Allocation for Topic Modeling........ 202
References and Further Reading.......................................... 209
|
any_adam_object | 1 |
author | Ghatak, Abhijit |
author_GND | (DE-588)1173132031 |
author_facet | Ghatak, Abhijit |
author_role | aut |
author_sort | Ghatak, Abhijit |
author_variant | a g ag |
building | Verbundindex |
bvnumber | BV045236870 |
classification_rvk | ST 300 ST 250 |
ctrlnum | (OCoLC)1048370485 (DE-599)BVBBV045236870 |
dewey-full | 006.3 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 006 - Special computer methods |
dewey-raw | 006.3 |
dewey-search | 006.3 |
dewey-sort | 16.3 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01530nam a2200397 c 4500</leader><controlfield tag="001">BV045236870</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20221019 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">181017s2017 |||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9789811068072</subfield><subfield code="9">978-981-10-6807-2</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1048370485</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV045236870</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-355</subfield><subfield code="a">DE-945</subfield><subfield code="a">DE-19</subfield><subfield code="a">DE-11</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">006.3</subfield><subfield code="2">23</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 300</subfield><subfield code="0">(DE-625)143650:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 250</subfield><subfield code="0">(DE-625)143626:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 250 R01</subfield><subfield code="2">sdnb</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Ghatak, Abhijit</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1173132031</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" 
ind2="0"><subfield code="a">Machine learning with R</subfield><subfield code="c">Abhijit Ghatak</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Singapore</subfield><subfield code="b">Springer</subfield><subfield code="c">[2017]</subfield></datafield><datafield tag="264" ind1=" " ind2="4"><subfield code="c">© 2017</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">xix, 210 Seiten</subfield><subfield code="b">Diagramme</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">R</subfield><subfield code="g">Programm</subfield><subfield code="0">(DE-588)4705956-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Maschinelles Lernen</subfield><subfield code="0">(DE-588)4193754-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Maschinelles Lernen</subfield><subfield code="0">(DE-588)4193754-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">R</subfield><subfield code="g">Programm</subfield><subfield code="0">(DE-588)4705956-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Online-Ausgabe</subfield><subfield code="z">978-981-10-6808-9</subfield></datafield><datafield 
tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Regensburg - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=030625140&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-030625140</subfield></datafield></record></collection> |
id | DE-604.BV045236870 |
illustrated | Not Illustrated |
indexdate | 2024-07-10T08:12:24Z |
institution | BVB |
isbn | 9789811068072 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-030625140 |
oclc_num | 1048370485 |
open_access_boolean | |
owner | DE-355 DE-BY-UBR DE-945 DE-19 DE-BY-UBM DE-11 |
owner_facet | DE-355 DE-BY-UBR DE-945 DE-19 DE-BY-UBM DE-11 |
physical | xix, 210 Seiten Diagramme |
publishDate | 2017 |
publishDateSearch | 2017 |
publishDateSort | 2017 |
publisher | Springer |
record_format | marc |
spelling | Ghatak, Abhijit Verfasser (DE-588)1173132031 aut Machine learning with R Abhijit Ghatak Singapore Springer [2017] © 2017 xix, 210 Seiten Diagramme txt rdacontent n rdamedia nc rdacarrier R Programm (DE-588)4705956-4 gnd rswk-swf Maschinelles Lernen (DE-588)4193754-5 gnd rswk-swf Maschinelles Lernen (DE-588)4193754-5 s R Programm (DE-588)4705956-4 s DE-604 Erscheint auch als Online-Ausgabe 978-981-10-6808-9 Digitalisierung UB Regensburg - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=030625140&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Ghatak, Abhijit Machine learning with R R Programm (DE-588)4705956-4 gnd Maschinelles Lernen (DE-588)4193754-5 gnd |
subject_GND | (DE-588)4705956-4 (DE-588)4193754-5 |
title | Machine learning with R |
title_auth | Machine learning with R |
title_exact_search | Machine learning with R |
title_full | Machine learning with R Abhijit Ghatak |
title_fullStr | Machine learning with R Abhijit Ghatak |
title_full_unstemmed | Machine learning with R Abhijit Ghatak |
title_short | Machine learning with R |
title_sort | machine learning with r |
topic | R Programm (DE-588)4705956-4 gnd Maschinelles Lernen (DE-588)4193754-5 gnd |
topic_facet | R Programm Maschinelles Lernen |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=030625140&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT ghatakabhijit machinelearningwithr |