Machine learning for knowledge discovery with R: methodologies for modeling, inference and prediction
"Machine Learning for Knowledge Discovery with R contains methodologies and examples for statistical modelling, inference, and prediction of data analysis. It includes many recent supervised and unsupervised machine learning methodologies such as recursive partitioning modelling, regularized re...
Gespeichert in:
| | |
|---|---|
| Main Author: | Tsai, Kao-Tai |
| Format: | Book |
| Language: | English |
| Published: | Boca Raton ; London ; New York : CRC Press, 2022 |
| Edition: | First edition |
| Subjects: | Data mining / Methodology; Machine learning; R (Computer program language) |
| Online Access: | Table of Contents |
| Summary: | "Machine Learning for Knowledge Discovery with R contains methodologies and examples for statistical modelling, inference, and prediction of data analysis. It includes many recent supervised and unsupervised machine learning methodologies such as recursive partitioning modelling, regularized regression, support vector machine, neural network, clustering, and causal-effect inference. Additionally, it emphasizes statistical thinking of data analysis, use of statistical graphs for data structure exploration, and result presentations. The book includes many real-world data examples from life-science, finance, etc. to illustrate the applications of the methods described therein"-- |
| Physical Description: | XV, 244 pages : diagrams ; 24 cm |
| ISBN: | 9781032065366 (hbk); 9781032071596 (pbk) |
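The summary names regularized regression among the book's methodologies. As a hedged illustration only, the minimal sketch below shows a lasso fit in R with the CRAN package glmnet, a package commonly used for this (the data are simulated and the example is not taken from the book):

```r
# Minimal sketch of regularized (lasso) regression in R with glmnet.
# All data and parameter choices here are invented for illustration.
library(glmnet)

set.seed(1)
x <- matrix(rnorm(100 * 20), nrow = 100)   # 100 observations, 20 predictors
beta <- c(rep(1.5, 3), rep(0, 17))         # only the first 3 predictors are active
y <- as.vector(x %*% beta + rnorm(100))

# alpha = 1 selects the lasso penalty; values between 0 and 1 give the elastic net
cv_fit <- cv.glmnet(x, y, alpha = 1)
coef(cv_fit, s = "lambda.min")             # sparse coefficient estimates
```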
Internal Format
MARC
| Tag | Ind | Content |
|---|---|---|
| LEADER | | 00000nam a2200000 c 4500 |
| 001 | | BV047657454 |
| 003 | | DE-604 |
| 005 | | 20231201 |
| 007 | | t |
| 008 | | 220103s2022 \|\|\|\| \|\|\|\| 00\|\|\| eng d |
| 015 | | $a GBC1B4055 $2 dnb |
| 020 | | $a 9781032065366 $c hbk $9 978-1-032-06536-6 |
| 020 | | $a 9781032071596 $c pbk $9 978-1-032-07159-6 |
| 035 | | $a (OCoLC)1296280482 |
| 035 | | $a (DE-599)BVBBV047657454 |
| 040 | | $a DE-604 $b ger $e rda |
| 041 | 0 | $a eng |
| 049 | | $a DE-739 |
| 084 | | $a ST 530 $0 (DE-625)143679: $2 rvk |
| 100 | 1 | $a Tsai, Kao-Tai $e Verfasser $0 (DE-588)1245968041 $4 aut |
| 245 | 1 0 | $a Machine learning for knowledge discovery with R $b methodologies for modeling, inference and prediction $c Kao-Tai Tsai |
| 250 | | $a First edition |
| 264 | 1 | $a Boca Raton ; London ; New York $b CRC Press $c 2022 |
| 300 | | $a XV, 244 Seiten $b Diagramme $c 24 cm |
| 336 | | $b txt $2 rdacontent |
| 337 | | $b n $2 rdamedia |
| 338 | | $b nc $2 rdacarrier |
| 520 | | $a "Machine Learning for Knowledge Discovery with R contains methodologies and examples for statistical modelling, inference, and prediction of data analysis. It includes many recent supervised and unsupervised machine learning methodologies such as recursive partitioning modelling, regularized regression, support vector machine, neural network, clustering, and causal-effect inference. Additionally, it emphasizes statistical thinking of data analysis, use of statistical graphs for data structure exploration, and result presentations. The book includes many real-world data examples from life-science, finance, etc. to illustrate the applications of the methods described therein"-- |
| 650 | 4 | $a Data mining / Methodology |
| 650 | 4 | $a Machine learning |
| 650 | 4 | $a R (Computer program language) |
| 650 | 7 | $a Machine learning $2 fast |
| 650 | 7 | $a R (Computer program language) $2 fast |
| 650 | 0 7 | $a Maschinelles Lernen $0 (DE-588)4193754-5 $2 gnd $9 rswk-swf |
| 650 | 0 7 | $a R $g Programm $0 (DE-588)4705956-4 $2 gnd $9 rswk-swf |
| 689 | 0 0 | $a Maschinelles Lernen $0 (DE-588)4193754-5 $D s |
| 689 | 0 1 | $a R $g Programm $0 (DE-588)4705956-4 $D s |
| 689 | 0 | $5 DE-604 |
| 776 | 0 8 | $i Erscheint auch als $n Online-Ausgabe $z 978-1-003-20568-5 |
| 856 | 4 2 | $m Digitalisierung UB Passau - ADAM Catalogue Enrichment $q application/pdf $u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=033042354&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA $3 Inhaltsverzeichnis |
| 999 | | $a oai:aleph.bib-bvb.de:BVB01-033042354 |
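The table above is the record's MARC view. As a hedged sketch (the local file name is hypothetical; the tag and subfield codes match the record above), a MARCXML export of such a record can be read in R with the CRAN package xml2:

```r
# Hedged sketch: reading a MARCXML export of this record with xml2.
# MARCXML uses the default namespace http://www.loc.gov/MARC21/slim,
# which xml2 binds to the prefix "d1".
library(xml2)

rec <- read_xml("BV047657454.marcxml")     # hypothetical local copy of the record
ns  <- xml_ns(rec)                         # namespace map; default namespace -> d1

title <- xml_find_first(rec, ".//d1:datafield[@tag='245']/d1:subfield[@code='a']", ns)
xml_text(title)    # "Machine learning for knowledge discovery with R"

isbns <- xml_find_all(rec, ".//d1:datafield[@tag='020']/d1:subfield[@code='a']", ns)
xml_text(isbns)    # "9781032065366" "9781032071596"
```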
Table of Contents

- Preface
- 1 Data Analysis
  - 1.1 Perspectives of Data Analysis
  - 1.2 Strategies and Stages of Data Analysis
  - 1.3 Data Quality
    - 1.3.1 Heterogeneity in Data Sources
      - 1.3.1.1 Heterogeneity in Study Subject Populations
      - 1.3.1.2 Heterogeneity in Data due to Timing of Generations
    - 1.3.2 Noise Accumulation
    - 1.3.3 Spurious Correlation
    - 1.3.4 Missing Data
  - 1.4 Data Sets Analyzed in This Book
    - 1.4.1 NCI-60
    - 1.4.2 Riboflavin Production with Bacillus Subtilis
    - 1.4.3 TCGA
    - 1.4.4 The Boston Housing Data Set
- 2 Examining Data Distribution
  - 2.1 One Dimension
    - 2.1.1 Histogram, Stem-and-Leaf, Density Plot
    - 2.1.2 Box Plot
    - 2.1.3 Quantile-Quantile (Q-Q) Plot, Normal Plot, Probability-Probability (P-P) Plot
  - 2.2 Two Dimension
    - 2.2.1 Scatter Plot
    - 2.2.2 Ellipse - Visualization of Covariance and Correlation
    - 2.2.3 Multivariate Normality Test
  - 2.3 More Than Two Dimension
    - 2.3.1 Scatter Plot Matrix
    - 2.3.2 Andrews's Plot
    - 2.3.3 Conditional Plot
  - 2.4 Visualization of Categorical Data
    - 2.4.1 Mosaic Plot
    - 2.4.2 Association Plot
- 3 Regressions
  - 3.1 Ridge Regression
  - 3.2 Lasso
    - 3.2.1 Example: Lasso on Continuous Data
    - 3.2.2 Example: Lasso on Binary Data
    - 3.2.3 Example: Lasso on Survival Data
  - 3.3 Group Lasso
    - 3.3.1 Example: Group Lasso on Gene Signatures
  - 3.4 Sparse Group Lasso
    - 3.4.1 Example: Lasso, Group Lasso, Sparse Group Lasso on Simulated Continuous Data
    - 3.4.2 Example: Lasso, Group Lasso, Sparse Group Lasso on Gene Signatures Continuous Data
  - 3.5 Adaptive Lasso
    - 3.5.1 Example: Adaptive Lasso on Continuous Data
    - 3.5.2 Example: Adaptive Lasso on Binary Data
  - 3.6 Elastic Net
    - 3.6.1 Example: Elastic Net on Continuous Data
    - 3.6.2 Example: Elastic Net on Binary Data
  - 3.7 The Sure Screening Method
    - 3.7.1 The Sure Screening Method
    - 3.7.2 Sure Independence Screening on Model Selection
    - 3.7.3 Example: SIS on Continuous Data
    - 3.7.4 Example: SIS on Survival Data
  - 3.8 Identify Minimal Class of Models
    - 3.8.1 Analysis Using Minimal Models
- 4 Recursive Partitioning Modeling
  - 4.1 Recursive Partitioning Modeling via Trees
    - 4.1.1 Elements of Growing a Tree
      - 4.1.1.1 Grow a Tree
    - 4.1.2 The Impurity Function
      - 4.1.2.1 Definition of Impurity Function
      - 4.1.2.2 Measure of Node Impurity - the Gini Index
    - 4.1.3 Misclassification Cost
    - 4.1.4 Size of Trees
    - 4.1.5 Example of Recursive Partitioning
      - 4.1.5.1 Recursive Partitioning with Binary Outcomes
      - 4.1.5.2 Recursive Partitioning with Continuous Outcomes
      - 4.1.5.3 Recursive Partitioning for Survival Outcomes
  - 4.2 Random Forest
    - 4.2.1 Mechanism of Action of Random Forests
    - 4.2.2 Variable Importance
    - 4.2.3 Random Forests for Regression
    - 4.2.4 Example of Random Forest Data Analysis
      - 4.2.4.1 randomForest for Binary Data
      - 4.2.4.2 randomForest for Continuous Data
  - 4.3 Random Survival Forest
    - 4.3.1 Algorithm to Construct RSF
    - 4.3.2 Individual and Ensemble Estimate at Terminal Nodes
    - 4.3.3 VIMP
    - 4.3.4 Example
  - 4.4 XGBoost: A Tree Boosting System
    - 4.4.1 Example Using xgboost for Data Analysis
      - 4.4.1.1 xgboost for Binary Data
      - 4.4.1.2 xgboost for Continuous Data
    - 4.4.2 Example - xgboost for Cox Regression
  - 4.5 Model-based Recursive Partitioning
    - 4.5.1 The Recursive Partitioning Algorithm
    - 4.5.2 Example
  - 4.6 Recursive Partition for Longitudinal Data
    - 4.6.1 Methodology
    - 4.6.2 Recursive Partition for Longitudinal Data Based on Baseline Covariates
      - 4.6.2.1 Methodology
    - 4.6.3 LongCART Algorithm
    - 4.6.4 Example of Recursive Partitioning of Longitudinal Data
  - 4.7 Analysis of Ordinal Data
  - 4.8 Examples - Analysis of Ordinal Data
    - 4.8.1 Analysis of Cleveland Clinic Heart Data (Ordinal)
    - 4.8.2 Analysis of Cleveland Clinic Heart Data (Twoing)
  - 4.9 Advantages and Disadvantages of Trees
- 5 Support Vector Machine
  - 5.1 General Theory of Classification and Regression in Hyperplane
    - 5.1.1 Separable Case
    - 5.1.2 Non-separable Case
      - 5.1.2.1 Method of Stochastic Approximation
      - 5.1.2.2 Method of Sigmoid Approximations
      - 5.1.2.3 Method of Radial Basis Functions
  - 5.2 SVM for Indicator Functions
    - 5.2.1 Optimal Hyperplane for Separable Data Sets
      - 5.2.1.1 Constructing the Optimal Hyperplane
    - 5.2.2 Optimal Hyperplane for Non-Separable Sets
      - 5.2.2.1 Generalization of the Optimal Hyperplane
    - 5.2.3 Support Vector Machine
    - 5.2.4 Constructing SVM
      - 5.2.4.1 Polynomial Kernel Functions
      - 5.2.4.2 Radial Basis Kernel Functions
    - 5.2.5 Example: Analysis of Binary Classification Using SVM
    - 5.2.6 Example: Effect of Kernel Selection
  - 5.3 SVM for Continuous Data
    - 5.3.1 Minimizing the Risk with ε-insensitive Loss Functions
    - 5.3.2 Example: Regression Analysis Using SVM
  - 5.4 SVM for Survival Data Analysis
    - 5.4.1 Example: Analysis of Survival Data Using SVM
  - 5.5 Feature Elimination for SVM
    - 5.5.1 Example: Gene Selection via SVM with Feature Elimination
  - 5.6 Sparse Bayesian Learning with Relevance Vector Machine (RVM)
    - 5.6.1 Example: Regression Analysis Using RVM
    - 5.6.2 Example: Curve Fitting for SVM and RVM
  - 5.7 SV Machines for Function Estimation
- 6 Cluster Analysis
  - 6.1 Measure of Distance/Dissimilarity
    - 6.1.1 Continuous Variables
    - 6.1.2 Binary and Categorical Variables
    - 6.1.3 Mixed Data Types
    - 6.1.4 Other Measure of Dissimilarity
  - 6.2 Hierarchical Clustering
    - 6.2.1 Options of Linkage
    - 6.2.2 Example of Hierarchical Clustering
  - 6.3 K-means Cluster
    - 6.3.1 General Description of K-means Clustering
    - 6.3.2 Estimating the Number of Clusters
  - 6.4 The PAM Clustering Algorithm
    - 6.4.1 Example of K-means with PAM Clustering Algorithm
  - 6.5 Bagged Clustering
    - 6.5.1 Example of Bagged Clustering
  - 6.6 RandomForest for Clustering
    - 6.6.1 Example: Random Forest for Clustering
  - 6.7 Mixture Models/Model-based Cluster Analysis
  - 6.8 Stability of Clusters
  - 6.9 Consensus Clustering
    - 6.9.1 Determination of Clusters
    - 6.9.2 Example of Consensus Clustering on RNA Sequence Data
  - 6.10 The Integrative Clustering Framework
    - 6.10.1 Example: Integrative Clustering
- 7 Neural Network
  - 7.1 General Theory of Neural Network
  - 7.2 Elemental Aspects and Structure of Artificial Neural Networks
  - 7.3 Multilayer Perceptrons
    - 7.3.1 The Simple (Single Unit) Perceptron
    - 7.3.2 Training Perceptron Learning
  - 7.4 Multilayer Perceptrons (MLP)
    - 7.4.1 Architectures of MLP
    - 7.4.2 Training MLP
  - 7.5 Deep Learning
    - 7.5.1 Model Parameterization
  - 7.6 Few Pros and Cons of Neural Networks
  - 7.7 Examples
- 8 Causal Inference and Matching
  - 8.1 Introduction
  - 8.2 Three Layer Causal Hierarchy
  - 8.3 Seven Tools of Causal Inference
  - 8.4 Statistical Framework of Causal Inferences
  - 8.5 Propensity Score
  - 8.6 Methodologies of Matching
    - 8.6.1 Nearest Neighbor (or greedy) Matching
      - 8.6.1.1 Example Using Nearest Neighbor Matching
    - 8.6.2 Exact Matching
      - 8.6.2.1 Example
    - 8.6.3 Mahalanobis Distance Matching
      - 8.6.3.1 Example
    - 8.6.4 Genetic Matching
      - 8.6.4.1 Example
  - 8.7 Optimal Matching
    - 8.7.0.1 Example
  - 8.8 Full Matching
    - 8.8.0.1 Example
    - 8.8.1 Analysis of Data After Matching
      - 8.8.1.1 Example
  - 8.9 Cluster Matching
    - 8.9.1 Example
- 9 Business
  - 9.1 Case Study One: Marketing Campaigns of a Portuguese Banking Institution
    - 9.1.1 Description of Data
    - 9.1.2 Data Analysis
      - 9.1.2.1 Analysis via Lasso
      - 9.1.2.2 Analysis via Elastic Net
      - 9.1.2.3 Analysis via SIS
      - 9.1.2.4 Analysis via rpart
      - 9.1.2.5 Analysis via randomForest
      - 9.1.2.6 Analysis via xgboost
  - 9.2 Summary
  - 9.3 Case Study Two: Polish Companies Bankruptcy Data
    - 9.3.1 Description of Data
    - 9.3.2 Data Analysis
      - 9.3.2.1 Analysis of Year-1 Data (univariate analysis)
      - 9.3.2.2 Analysis of Year-3 Data (univariate analysis)
      - 9.3.2.3 Analysis of Year-5 Data (univariate analysis)
      - 9.3.2.4 Analysis of Year-1 Data (composite analysis)
      - 9.3.2.5 Analysis of Year-3 Data (composite analysis)
      - 9.3.2.6 Analysis of Year-5 Data (composite analysis)
  - 9.4 Summary
- 10 Analysis of Response Profiles
  - 10.1 Introduction
  - 10.2 Data Example
  - 10.3 Transition of Response States
  - 10.4 Classification of Response Profiles
    - 10.4.1 Dissimilarities Between Response Profiles
    - 10.4.2 Visualizing Clusters via Multidimensional Scaling
    - 10.4.3 Response Profile Differences among Clusters
    - 10.4.4 Significant Clinical Variables for Each Cluster
  - 10.5 Modeling of Response Profiles via GEE
    - 10.5.1 Marginal Models
    - 10.5.2 Estimation of Marginal Regression Parameters
    - 10.5.3 Local Odds Ratio
    - 10.5.4 Results of Modeling
  - 10.6 Summary
- Bibliography
- Index
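The section titles in Chapters 4 and 9 name the R packages rpart, randomForest, and xgboost. As a hedged sketch of the first two on built-in data (the iris data set stands in here; these are not the book's own examples):

```r
# Hedged sketch of the tree-based methods named in the table of contents:
# a classification tree with rpart and a random forest with randomForest.
library(rpart)
library(randomForest)

# Recursive partitioning: grow a classification tree
# (rpart splits on the Gini index by default for classification)
tree <- rpart(Species ~ ., data = iris, method = "class")
print(tree)

# Random forest with variable importance, as in Section 4.2.2
set.seed(1)
rf <- randomForest(Species ~ ., data = iris, importance = TRUE)
importance(rf)   # permutation (accuracy) and Gini importance measures
```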