Statistics and machine learning methods for EHR data: from data extraction to data analytics
Other authors: | Wu, Hulin ; Yamal, Jose-Miguel ; Yaseen, Ashraf ; Maroufy, Vahed |
---|---|
Format: | Electronic E-Book |
Language: | English |
Published: | Boca Raton ; London ; New York : CRC Press, 2021 |
Edition: | First edition |
Series: | Chapman & Hall/CRC Healthcare informatics series |
Online access: | TUM01 |
Description: | Description based on publisher supplied metadata and other sources |
Description: | 1 online resource, illustrations, diagrams |
ISBN: | 9781000260946 9781003030003 9781000260960 9781000260953 |
Internal format
MARC
LEADER | 00000nmm a2200000zc 4500 | ||
---|---|---|---|
001 | BV047688347 | ||
003 | DE-604 | ||
005 | 20220429 | ||
007 | cr|uuu---uuuuu | ||
008 | 220118s2021 |||| o||u| ||||||eng d | ||
020 | |a 9781000260946 |9 978-1-00-026094-6 | ||
020 | |a 9781003030003 |9 978-1-003-03000-3 | ||
020 | |a 9781000260960 |9 978-1-00-026096-0 | ||
020 | |a 9781000260953 |9 978-1-00-026095-3 | ||
035 | |a (ZDB-30-PQE)EBC6378532 | ||
035 | |a (ZDB-30-PAD)EBC6378532 | ||
035 | |a (ZDB-89-EBL)EBL6378532 | ||
035 | |a (OCoLC)1314898362 | ||
035 | |a (DE-599)BVBBV047688347 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-91 | ||
245 | 1 | 0 | |a Statistics and machine learning methods for EHR data |b from data extraction to data analytics |c edited by Hulin Wu, Jose-Miguel Yamal, Ashraf Yaseen, and Vahed Maroufy |
250 | |a First edition | ||
264 | 1 | |a Boca Raton ; London ; New York |b CRC Press |c 2021 | |
264 | 4 | |c © 2021 | |
300 | |a 1 Online-Ressource |b Illustrationen, Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b c |2 rdamedia | ||
338 | |b cr |2 rdacarrier | ||
490 | 0 | |a Chapman & Hall/CRC Healthcare informatics series | |
500 | |a Description based on publisher supplied metadata and other sources | ||
505 | 8 | |a Intro -- Half Title -- Series Page -- Title Page -- Copyright Page -- Contents -- Preface -- About the Editors -- Contributors -- 1. Introduction: Use of EHR Data for Scientific Discoveries--Challenges and Opportunities -- 1.1. Real-World Data and Real-World Evidence: Big Data in Practice -- 1.2. Use of EMR/EHR Database for Research and Scientific Discoveries: Procedure and Life Cycle -- 1.2.1. Initiate a Project -- 1.2.2. Data Queries and Data Extraction -- 1.2.3. Data Cleaning -- 1.2.4. Data Pre-Processing or Processing -- 1.2.5. Data Preparation -- 1.2.6. Data Analysis, Modeling and Prediction -- 1.2.7. Result Validation -- 1.2.8. Result Interpretation -- 1.2.9. Publication and Dissemination -- 1.3. Challenges and Opportunities -- References -- 2. EHR Project Management -- 2.1. Introduction -- 2.1.1. What is Project Management? -- 2.1.2. Why We Need Project Management? -- 2.1.3. Project Management Goals and Principles -- 2.2. Project and Sub-Project in EHR Research -- 2.3. Data, Code and Product Management -- 2.3.1. Data Loss Prevention -- 2.3.2. Naming Conventions -- 2.3.3. Version Control -- 2.3.4. Coding Convention -- Object-Oriented or Non-Object-Oriented Programming -- 2.3.5. Document Management: Data Analysis Report, Papers and Read-Me Documents -- 2.4. Team/People Management -- 2.4.1. How to Form a Team: What Expertise is Needed for EHR Projects? -- 2.4.2. How to Efficiently Manage a Multidisciplinary Team? -- 2.4.3. Task Management -- 2.5. Management Methods and Software Tools -- 2.6. An Example of a Data Management Framework -- 2.6.1. Folder Management -- Naming -- Structure -- Main Folders -- CBD_HS -- Public_Folder -- Admin -- Useful_Info -- Group Folders -- Project Folders -- Sub_Project Folders -- 2.6.2. File Management -- Naming -- Structure -- File Submission -- 2.6.3. User Management -- User Groups | |
505 | 8 | |a 2.6.4. Data Management Framework -- 2.7. Discussion and Summary -- 2.8. Appendix--File Submission Form -- Note -- References -- 3. EHR Databases and Data Management: Data Query and Extraction -- 3.1. Introduction -- 3.2. EHR/EMR Database Availability and Access -- 3.3. EHR/EMR Database Design and Structure: Database Queries -- 3.3.1. Database Construction -- 3.3.2. Traditional Relational Database System -- 3.3.3. Distributed Database System -- 3.4. Data Extraction -- 3.4.1. Define Inclusion/Exclusion Criteria for Data Extraction -- 3.4.2. Phenotyping: Cohort Identification -- 3.5. Data Extraction Report -- 3.6. Illustration Example: Subarachnoid Hemorrhage (SAH) Project -- 3.6.1. EHR Database Design and Construction -- 3.6.2. SAH Cohort Identification and Data Extraction -- 3.6.3. Data Extraction Report -- 3.6.4. Potential Data Extraction Pitfalls and Errors with Solutions -- References -- 4. EHR Data Cleaning -- 4.1. Introduction -- 4.2. Review of Current Data Cleaning Methods and Tools -- 4.2.1. Data Wranglers -- 4.2.2. Data Cleaning Tools for Specific EHR Datasets -- 4.2.3. Data Quality Assessment -- 4.3. Common EHR Data Errors and Fixing Methods -- 4.3.1. List of Common Errors in an EHR Database -- 4.3.2. Demographics Table -- Multiple Race and Gender -- Multiple Patient Keys for the Same Encounter ID -- Multiple Calculated Birth Date -- 4.3.3. Lab Table -- Developing Conversion Map -- Conversion Map ID -- Convert To -- Conversion Equation -- The Lower Limit and Upper Limit -- Lab Date and Time -- User Input Form and Report Generator -- Output -- 4.3.4. Clinical Event Table -- Variable Combining -- Information Recovery -- A Case Study -- Overlap of Different Tables -- Correction of Misinformation -- 4.3.5. Diagnosis and Medication Table -- 4.3.6. Procedure Table -- Introduction to the Procedure Code Data -- Procedure Table Data Cleaning | |
505 | 8 | |a 4.4. Discussion -- Acknowledgments -- Notes -- References -- 5. EHR Data Pre-Processing and Preparation -- 5.1. Introduction -- 5.1.1. Definition of Data Pre-Processing/Processing -- 5.1.2. Definition of Data Preparation -- 5.2. Data Pre-Processing -- 5.2.1. Tidy Data Principles -- Variable Encoding -- 5.2.2. Feature Extraction: Derived Variables -- 5.2.3. Dimension Reduction -- Variable Grouping or Clustering -- Principle Component Analysis (PCA) -- Embedding and Deep Learning -- 5.2.4. Missing Data Imputation -- 5.3. Data Preparation -- 5.3.1. Define the Endpoint or Outcome -- 5.3.2. Process Medical Record Timestamps -- 5.3.3. Define the Encounter Time Interval -- 5.3.4. Encounter Combination -- 5.3.5. Define Comparison Groups -- 5.3.6. Cohort Refining -- 5.3.7. Leakage Detection -- 5.3.8. Data Preparation for Different Analysis Purposes -- 5.4. Data Processing/Preparation Errors and Pitfalls with Solutions -- 5.5. Data Pre-Processing and Preparation Report -- 5.6. Summary -- References -- 6. Missing Data Issues in EHR -- 6.1. Introduction and Overview -- 6.2. Missing Data Mechanisms -- 6.3. Methods for Incomplete EHR Data -- 6.3.1. Naïve Method -- 6.3.2. Imputation Using Statistical Models -- 6.3.3. Machine Learning and Deep Learning Models -- 6.3.4. Choice of Best Method for EHR Data -- 6.4. Case Study -- 6.4.1. Missing Condition in EHR Data -- 6.4.2. Missing Imputation in EHR Datasets -- 6.4.3. Evaluating the Performance of Imputation Methods and Thresholds -- 6.5. Discussion and Conclusion -- References -- 7. Causal Inference and Analysis for EHR Data -- 7.1. Introduction -- 7.1.1. Why Causal Inference -- 7.1.2. Overview of Causal Inference Methods: Rubin Causal Model (RCM) -- 7.1.3. Basic Framework in Causality: Potential Outcome Framework -- Average and Individual Treatment Effects -- 7.2. Propensity Scoring -- 7.2.1. Brief Introduction | |
505 | 8 | |a 7.2.2. Propensity Scoring for Binary Treatments -- 7.2.3. Propensity Scoring for Multiple Treatments -- 7.2.4. Propensity Scoring for Ordinal Treatments -- 7.2.5. Propensity Score Estimation for Complex Data Sets -- 7.2.6. Illustration Example: Subarachnoid Hemorrhage (SAH) Project -- 7.3. Mediation Analysis -- 7.3.1. Introduction to Mediation Analysis -- 7.3.2. The Product Method -- 7.3.3. The Difference Method -- 7.3.4. Other Considerations -- 7.4. Instrumental Variables Networks for Treatment Effect Estimation in the Presence of Unmeasured Confounders -- 7.4.1. Instrumental Variables Frameworks -- 7.4.2. Two-Stage Least Square Methods with Linear Models -- Simple Linear Models -- Covariance Analysis -- Generalized Least Square Estimator -- Two-Stage Least Square Method -- Nonlinear Models for Two-Stage Least Squares Approach -- 7.5. Learning Treatment Effect by Generative Adversarial Networks -- 7.5.1. Introduction -- 7.5.2. CGANs as a General Framework for Estimation of Individualized Treatment Effects -- The Architecture of CGANs for Generating Potential Outcomes -- CGANs for Estimating ITEs -- CGANs for Estimating ITEs in Survival Analysis -- 7.5.3. Wasserstein GANs for Estimation of Individualized Treatment Effects -- 7.5.4. MisCGANs for Estimation of Individualized Treatment Effects -- The General Process for Incompletely Observed Data -- MisGAN for Counterfactual Imputation -- 7.5.5. Optimal Treatment Selection -- Sparse Techniques for Biomarker Identification -- Biomarker Identification for Optimal Treatment Selection -- 7.6. Deconfounder in Estimation of Treatment Effects -- 7.6.1. Introduction -- 7.6.2. Causal Models with Latent Confounders -- 7.6.3. Adversarial Learning Confounders -- 7.6.4. Loss Function and Optimization for Estimating ITEs in the Presence of Confounders -- 7.7. Targeted Maximum Likelihood Estimation | |
505 | 8 | |a 7.8. Supplementary Note A -- 7.8.1. Wasserstein GAN -- A1 Different Distances -- A1.1 Maximum Likelihood Estimation -- A1.2 Total Variation (TV) Distance -- A1.3 The Kullback-Leibler (KL) Divergence -- A1.4 The Jenson-Shannon (JS) Divergence -- A1.5 Earth Mover (EM) or Wasserstein Distance -- A2 Wasserstein GAN -- A3 Algorithm (WGAN) -- References -- 8. EHR Data Exploration, Analysis and Predictions: Statistical Models and Methods -- 8.1. Introduction -- 8.1.1. Statistical Challenges for EHR Data -- 8.1.2. Overview of Existing Methods -- 8.2. Data Exploration and Visualization -- 8.3. Statistical Models for EHR Data -- 8.3.1. Contingency Tables -- 8.3.2. Chi-Square Test -- 8.3.3. Hypergeometric Test -- 8.4. GLM -- 8.5. Survival Model -- 8.6. Mixed-Effect Models -- 8.7. Time Series Analysis -- 8.7.1. AR, MA and ARMA Model -- 8.7.2. Gaussian Process -- 8.8. Variable Selection Methods -- 8.8.1. Stepwise Variable Selection -- 8.8.2. Purposeful Variable Selection -- 8.8.3. SIS -- 8.8.4. Penalty-Based Methods -- 8.9. Divide-and-Conquer Method -- 8.10. Validation -- 8.11. Results and Examples -- 8.12. Discussions and Conclusions -- References -- 9. Neural Network and Deep Learning Methods for EHR Data -- 9.1. Introduction -- 9.2. Deep Learning Methods for EHR Data -- 9.3. Deep Learning Software Tools and Implementation -- 9.4. Application Examples -- Case Study 1: Application of MLP for Mortality Prediction -- Case Study 2: Application of RNN for Heart Failure Prediction for Hypertension Patients -- Experimental Setting -- RNN Prediction Results -- 9.5. Discussion -- References -- 10. EHR Data Analytics and Predictions: Machine Learning Methods -- 10.1. Machine Learning Overview -- 10.2. Machine Learning Methods -- Random Forest -- Extremely Randomized Tree -- Gradient Boosting -- XgBoost -- Support Vector Machine (SVM) | |
505 | 8 | |a 10.3. Machine Learning Software Tools | |
700 | 1 | |a Wu, Hulin |4 edt | |
700 | 1 | |a Yamal, Jose-Miguel |4 edt | |
700 | 1 | |a Yaseen, Ashraf |4 edt | |
700 | 1 | |a Maroufy, Vahed |4 edt | |
776 | 0 | 8 | |i Erscheint auch als |a Wu, Hulin |t Statistics and Machine Learning Methods for EHR Data |d Milton : CRC Press LLC,c2020 |n Druck-Ausgabe, Hardcover |z 978-0-367-44239-2 |
912 | |a ZDB-30-PQE |a ZDB-4-NLEBK | ||
999 | |a oai:aleph.bib-bvb.de:BVB01-033072363 | ||
966 | e | |u https://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&db=nlabk&AN=2659670 |l TUM01 |p ZDB-4-NLEBK |q TUM_PDA_EBSCOMED_Kauf |x Aggregator |3 Volltext |
Record in the search index
_version_ | 1804183173962137600 |
---|---|
adam_txt | |
any_adam_object | |
any_adam_object_boolean | |
author2 | Wu, Hulin Yamal, Jose-Miguel Yaseen, Ashraf Maroufy, Vahed |
author2_role | edt edt edt edt |
author2_variant | h w hw j m y jmy a y ay v m vm |
author_facet | Wu, Hulin Yamal, Jose-Miguel Yaseen, Ashraf Maroufy, Vahed |
building | Verbundindex |
bvnumber | BV047688347 |
collection | ZDB-30-PQE ZDB-4-NLEBK |
ctrlnum | (ZDB-30-PQE)EBC6378532 (ZDB-30-PAD)EBC6378532 (ZDB-89-EBL)EBL6378532 (OCoLC)1314898362 (DE-599)BVBBV047688347 |
edition | First edition |
format | Electronic eBook |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>11266nmm a2200517zc 4500</leader><controlfield tag="001">BV047688347</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20220429 </controlfield><controlfield tag="007">cr|uuu---uuuuu</controlfield><controlfield tag="008">220118s2021 |||| o||u| ||||||eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781000260946</subfield><subfield code="9">978-1-00-026094-6</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781003030003</subfield><subfield code="9">978-1-003-03000-3</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781000260960</subfield><subfield code="9">978-1-00-026096-0</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781000260953</subfield><subfield code="9">978-1-00-026095-3</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(ZDB-30-PQE)EBC6378532</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(ZDB-30-PAD)EBC6378532</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(ZDB-89-EBL)EBL6378532</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1314898362</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV047688347</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-91</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Statistics and machine learning methods for EHR data</subfield><subfield code="b">from data extraction to data 
analytics</subfield><subfield code="c">edited by Hulin Wu, Jose-Miguel Yamal, Ashraf Yaseen, and Vahed Maroufy</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">First edition</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Boca Raton ; London ; New York</subfield><subfield code="b">CRC Press</subfield><subfield code="c">2021</subfield></datafield><datafield tag="264" ind1=" " ind2="4"><subfield code="c">© 2021</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">1 Online-Ressource</subfield><subfield code="b">Illustrationen, Diagramme</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">c</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">cr</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">Chapman & Hall/CRC Healthcare informatics series</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Description based on publisher supplied metadata and other sources</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">Intro -- Half Title -- Series Page -- Title Page -- Copyright Page -- Contents -- Preface -- About the Editors -- Contributors -- 1. Introduction: Use of EHR Data for Scientific Discoveries--Challenges and Opportunities -- 1.1. Real-World Data and Real-World Evidence: Big Data in Practice -- 1.2. Use of EMR/EHR Database for Research and Scientific Discoveries: Procedure and Life Cycle -- 1.2.1. Initiate a Project -- 1.2.2. Data Queries and Data Extraction -- 1.2.3. Data Cleaning -- 1.2.4. Data Pre-Processing or Processing -- 1.2.5. Data Preparation -- 1.2.6. Data Analysis, Modeling and Prediction -- 1.2.7. Result Validation -- 1.2.8. 
Result Interpretation -- 1.2.9. Publication and Dissemination -- 1.3. Challenges and Opportunities -- References -- 2. EHR Project Management -- 2.1. Introduction -- 2.1.1. What is Project Management? -- 2.1.2. Why We Need Project Management? -- 2.1.3. Project Management Goals and Principles -- 2.2. Project and Sub-Project in EHR Research -- 2.3. Data, Code and Product Management -- 2.3.1. Data Loss Prevention -- 2.3.2. Naming Conventions -- 2.3.3. Version Control -- 2.3.4. Coding Convention -- Object-Oriented or Non-Object-Oriented Programming -- 2.3.5. Document Management: Data Analysis Report, Papers and Read-Me Documents -- 2.4. Team/People Management -- 2.4.1. How to Form a Team: What Expertise is Needed for EHR Projects? -- 2.4.2. How to Efficiently Manage a Multidisciplinary Team? -- 2.4.3. Task Management -- 2.5. Management Methods and Software Tools -- 2.6. An Example of a Data Management Framework -- 2.6.1. Folder Management -- Naming -- Structure -- Main Folders -- CBD_HS -- Public_Folder -- Admin -- Useful_Info -- Group Folders -- Project Folders -- Sub_Project Folders -- 2.6.2. File Management -- Naming -- Structure -- File Submission -- 2.6.3. User Management -- User Groups</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">2.6.4. Data Management Framework -- 2.7. Discussion and Summary -- 2.8. Appendix--File Submission Form -- Note -- References -- 3. EHR Databases and Data Management: Data Query and Extraction -- 3.1. Introduction -- 3.2. EHR/EMR Database Availability and Access -- 3.3. EHR/EMR Database Design and Structure: Database Queries -- 3.3.1. Database Construction -- 3.3.2. Traditional Relational Database System -- 3.3.3. Distributed Database System -- 3.4. Data Extraction -- 3.4.1. Define Inclusion/Exclusion Criteria for Data Extraction -- 3.4.2. Phenotyping: Cohort Identification -- 3.5. Data Extraction Report -- 3.6. Illustration Example: Subarachnoid Hemorrhage (SAH) Project -- 3.6.1. 
EHR Database Design and Construction -- 3.6.2. SAH Cohort Identification and Data Extraction -- 3.6.3. Data Extraction Report -- 3.6.4. Potential Data Extraction Pitfalls and Errors with Solutions -- References -- 4. EHR Data Cleaning -- 4.1. Introduction -- 4.2. Review of Current Data Cleaning Methods and Tools -- 4.2.1. Data Wranglers -- 4.2.2. Data Cleaning Tools for Specific EHR Datasets -- 4.2.3. Data Quality Assessment -- 4.3. Common EHR Data Errors and Fixing Methods -- 4.3.1. List of Common Errors in an EHR Database -- 4.3.2. Demographics Table -- Multiple Race and Gender -- Multiple Patient Keys for the Same Encounter ID -- Multiple Calculated Birth Date -- 4.3.3. Lab Table -- Developing Conversion Map -- Conversion Map ID -- Convert To -- Conversion Equation -- The Lower Limit and Upper Limit -- Lab Date and Time -- User Input Form and Report Generator -- Output -- 4.3.4. Clinical Event Table -- Variable Combining -- Information Recovery -- A Case Study -- Overlap of Different Tables -- Correction of Misinformation -- 4.3.5. Diagnosis and Medication Table -- 4.3.6. Procedure Table -- Introduction to the Procedure Code Data -- Procedure Table Data Cleaning</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">4.4. Discussion -- Acknowledgments -- Notes -- References -- 5. EHR Data Pre-Processing and Preparation -- 5.1. Introduction -- 5.1.1. Definition of Data Pre-Processing/Processing -- 5.1.2. Definition of Data Preparation -- 5.2. Data Pre-Processing -- 5.2.1. Tidy Data Principles -- Variable Encoding -- 5.2.2. Feature Extraction: Derived Variables -- 5.2.3. Dimension Reduction -- Variable Grouping or Clustering -- Principle Component Analysis (PCA) -- Embedding and Deep Learning -- 5.2.4. Missing Data Imputation -- 5.3. Data Preparation -- 5.3.1. Define the Endpoint or Outcome -- 5.3.2. Process Medical Record Timestamps -- 5.3.3. Define the Encounter Time Interval -- 5.3.4. Encounter Combination -- 5.3.5. 
Define Comparison Groups -- 5.3.6. Cohort Refining -- 5.3.7. Leakage Detection -- 5.3.8. Data Preparation for Different Analysis Purposes -- 5.4. Data Processing/Preparation Errors and Pitfalls with Solutions -- 5.5. Data Pre-Processing and Preparation Report -- 5.6. Summary -- References -- 6. Missing Data Issues in EHR -- 6.1. Introduction and Overview -- 6.2. Missing Data Mechanisms -- 6.3. Methods for Incomplete EHR Data -- 6.3.1. Naïve Method -- 6.3.2. Imputation Using Statistical Models -- 6.3.3. Machine Learning and Deep Learning Models -- 6.3.4. Choice of Best Method for EHR Data -- 6.4. Case Study -- 6.4.1. Missing Condition in EHR Data -- 6.4.2. Missing Imputation in EHR Datasets -- 6.4.3. Evaluating the Performance of Imputation Methods and Thresholds -- 6.5. Discussion and Conclusion -- References -- 7. Causal Inference and Analysis for EHR Data -- 7.1. Introduction -- 7.1.1. Why Causal Inference -- 7.1.2. Overview of Causal Inference Methods: Rubin Causal Model (RCM) -- 7.1.3. Basic Framework in Causality: Potential Outcome Framework -- Average and Individual Treatment Effects -- 7.2. Propensity Scoring -- 7.2.1. Brief Introduction</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">7.2.2. Propensity Scoring for Binary Treatments -- 7.2.3. Propensity Scoring for Multiple Treatments -- 7.2.4. Propensity Scoring for Ordinal Treatments -- 7.2.5. Propensity Score Estimation for Complex Data Sets -- 7.2.6. Illustration Example: Subarachnoid Hemorrhage (SAH) Project -- 7.3. Mediation Analysis -- 7.3.1. Introduction to Mediation Analysis -- 7.3.2. The Product Method -- 7.3.3. The Difference Method -- 7.3.4. Other Considerations -- 7.4. Instrumental Variables Networks for Treatment Effect Estimation in the Presence of Unmeasured Confounders -- 7.4.1. Instrumental Variables Frameworks -- 7.4.2. 
Two-Stage Least Square Methods with Linear Models -- Simple Linear Models -- Covariance Analysis -- Generalized Least Square Estimator -- Two-Stage Least Square Method -- Nonlinear Models for Two-Stage Least Squares Approach -- 7.5. Learning Treatment Effect by Generative Adversarial Networks -- 7.5.1. Introduction -- 7.5.2. CGANs as a General Framework for Estimation of Individualized Treatment Effects -- The Architecture of CGANs for Generating Potential Outcomes -- CGANs for Estimating ITEs -- CGANs for Estimating ITEs in Survival Analysis -- 7.5.3. Wasserstein GANs for Estimation of Individualized Treatment Effects -- 7.5.4. MisCGANs for Estimation of Individualized Treatment Effects -- The General Process for Incompletely Observed Data -- MisGAN for Counterfactual Imputation -- 7.5.5. Optimal Treatment Selection -- Sparse Techniques for Biomarker Identification -- Biomarker Identification for Optimal Treatment Selection -- 7.6. Deconfounder in Estimation of Treatment Effects -- 7.6.1. Introduction -- 7.6.2. Causal Models with Latent Confounders -- 7.6.3. Adversarial Learning Confounders -- 7.6.4. Loss Function and Optimization for Estimating ITEs in the Presence of Confounders -- 7.7. Targeted Maximum Likelihood Estimation</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">7.8. Supplementary Note A -- 7.8.1. Wasserstein GAN -- A1 Different Distances -- A1.1 Maximum Likelihood Estimation -- A1.2 Total Variation (TV) Distance -- A1.3 The Kullback-Leibler (KL) Divergence -- A1.4 The Jenson-Shannon (JS) Divergence -- A1.5 Earth Mover (EM) or Wasserstein Distance -- A2 Wasserstein GAN -- A3 Algorithm (WGAN) -- References -- 8. EHR Data Exploration, Analysis and Predictions: Statistical Models and Methods -- 8.1. Introduction -- 8.1.1. Statistical Challenges for EHR Data -- 8.1.2. Overview of Existing Methods -- 8.2. Data Exploration and Visualization -- 8.3. Statistical Models for EHR Data -- 8.3.1. Contingency Tables -- 8.3.2. 
Chi-Square Test -- 8.3.3. Hypergeometric Test -- 8.4. GLM -- 8.5. Survival Model -- 8.6. Mixed-Effect Models -- 8.7. Time Series Analysis -- 8.7.1. AR, MA and ARMA Model -- 8.7.2. Gaussian Process -- 8.8. Variable Selection Methods -- 8.8.1. Stepwise Variable Selection -- 8.8.2. Purposeful Variable Selection -- 8.8.3. SIS -- 8.8.4. Penalty-Based Methods -- 8.9. Divide-and-Conquer Method -- 8.10. Validation -- 8.11. Results and Examples -- 8.12. Discussions and Conclusions -- References -- 9. Neural Network and Deep Learning Methods for EHR Data -- 9.1. Introduction -- 9.2. Deep Learning Methods for EHR Data -- 9.3. Deep Learning Software Tools and Implementation -- 9.4. Application Examples -- Case Study 1: Application of MLP for Mortality Prediction -- Case Study 2: Application of RNN for Heart Failure Prediction for Hypertension Patients -- Experimental Setting -- RNN Prediction Results -- 9.5. Discussion -- References -- 10. EHR Data Analytics and Predictions: Machine Learning Methods -- 10.1. Machine Learning Overview -- 10.2. Machine Learning Methods -- Random Forest -- Extremely Randomized Tree -- Gradient Boosting -- XGBoost -- Support Vector Machine (SVM)</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">10.3. 
Machine Learning Software Tools</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Wu, Hulin</subfield><subfield code="4">edt</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Yamal, Jose-Miguel</subfield><subfield code="4">edt</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Yaseen, Ashraf</subfield><subfield code="4">edt</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Maroufy, Vahed</subfield><subfield code="4">edt</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="a">Wu, Hulin</subfield><subfield code="t">Statistics and Machine Learning Methods for EHR Data</subfield><subfield code="d">Milton : CRC Press LLC,c2020</subfield><subfield code="n">Druck-Ausgabe, Hardcover</subfield><subfield code="z">978-0-367-44239-2</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ZDB-30-PQE</subfield><subfield code="a">ZDB-4-NLEBK</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-033072363</subfield></datafield><datafield tag="966" ind1="e" ind2=" "><subfield code="u">https://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&db=nlabk&AN=2659670</subfield><subfield code="l">TUM01</subfield><subfield code="p">ZDB-4-NLEBK</subfield><subfield code="q">TUM_PDA_EBSCOMED_Kauf</subfield><subfield code="x">Aggregator</subfield><subfield code="3">Volltext</subfield></datafield></record></collection> |
id | DE-604.BV047688347 |
illustrated | Not Illustrated |
index_date | 2024-07-03T18:57:01Z |
indexdate | 2024-07-10T09:19:15Z |
institution | BVB |
isbn | 9781000260946 9781003030003 9781000260960 9781000260953 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-033072363 |
oclc_num | 1314898362 |
open_access_boolean | |
owner | DE-91 DE-BY-TUM |
owner_facet | DE-91 DE-BY-TUM |
physical | 1 Online-Ressource Illustrationen, Diagramme |
psigel | ZDB-30-PQE ZDB-4-NLEBK ZDB-4-NLEBK TUM_PDA_EBSCOMED_Kauf |
publishDate | 2021 |
publishDateSearch | 2021 |
publishDateSort | 2021 |
publisher | CRC Press |
record_format | marc |
series2 | Chapman & Hall/CRC Healthcare informatics series |
title | Statistics and machine learning methods for EHR data from data extraction to data analytics |
title_full | Statistics and machine learning methods for EHR data from data extraction to data analytics edited by Hulin Wu, Jose-Miguel Yamal, Ashraf Yaseen, and Vahed Maroufy |
title_short | Statistics and machine learning methods for EHR data |
title_sort | statistics and machine learning methods for ehr data from data extraction to data analytics |
title_sub | from data extraction to data analytics |