Applied statistical genetics with R: for population-based association studies
Main author: | Foulkes, Andrea S. |
---|---|
Format: | Book |
Language: | English |
Published: | Dordrecht [etc.] : Springer, 2009 |
Series: | Use R! |
Subjects: | |
Online access: | Table of contents |
Description: | XXIII, 252 pages, graphs |
ISBN: | 9780387895536 9780387895543 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV035556557 | ||
003 | DE-604 | ||
005 | 20191002 | ||
007 | t | ||
008 | 090609s2009 d||| |||| 00||| eng d | ||
015 | |a 09,N06,0751 |2 dnb | ||
016 | 7 | |a 992146372 |2 DE-101 | |
020 | |a 9780387895536 |c PB. : EUR 48.10 (freier Pr.), sfr 75.00 (freier Pr.) |9 978-0-387-89553-6 | ||
020 | |a 9780387895543 |9 978-0-387-89554-3 | ||
035 | |a (OCoLC)461253485 | ||
035 | |a (DE-599)DNB992146372 | ||
040 | |a DE-604 |b ger |e rakddb | ||
041 | 0 | |a eng | |
049 | |a DE-355 |a DE-20 |a DE-91G |a DE-29 |a DE-M49 |a DE-19 |a DE-91S | ||
050 | 0 | |a QH438.4.S73 | |
082 | 0 | |a 576.50727 |2 22 | |
084 | |a ST 250 |0 (DE-625)143626: |2 rvk | ||
084 | |a WC 7000 |0 (DE-625)148142: |2 rvk | ||
084 | |a BIO 107f |2 stub | ||
084 | |a 570 |2 sdnb | ||
084 | |a MAT 620f |2 stub | ||
084 | |a BIO 180f |2 stub | ||
084 | |a DAT 307f |2 stub | ||
100 | 1 | |a Foulkes, Andrea S. |e Verfasser |4 aut | |
245 | 1 | 0 | |a Applied statistical genetics with R |b for population-based association studies |c Andrea S. Foulkes |
264 | 1 | |a Dordrecht [u.a.] |b Springer |c 2009 | |
300 | |a XXIII, 252 S. |b graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 0 | |a Use R! | |
650 | 4 | |a Automatic Data Processing | |
650 | 4 | |a Epidemiologic Methods | |
650 | 4 | |a Genetics, Population |x methods | |
650 | 4 | |a Models, Statistical | |
650 | 4 | |a Programming Languages | |
650 | 0 | 7 | |a Statistik |0 (DE-588)4056995-0 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Biostatistik |0 (DE-588)4729990-3 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Genetik |0 (DE-588)4071711-2 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a R |g Programm |0 (DE-588)4705956-4 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Biostatistik |0 (DE-588)4729990-3 |D s |
689 | 0 | |5 DE-604 | |
689 | 1 | 0 | |a Genetik |0 (DE-588)4071711-2 |D s |
689 | 1 | 1 | |a Statistik |0 (DE-588)4056995-0 |D s |
689 | 1 | 2 | |a R |g Programm |0 (DE-588)4705956-4 |D s |
689 | 1 | |5 DE-604 | |
856 | 4 | 2 | |m Digitalisierung UB Regensburg |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=017612360&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-017612360 |
Record in the search index
_version_ | 1804139203694428160 |
---|---|
adam_text | Contents
Preface ... VII
List of Tables ... XVII
List of Figures ... XIX
Acronyms ... XXI
1 Genetic Association Studies ... 1
1.1 Overview of population-based investigations ... 2
1.1.1 Types of investigations ... 2
1.1.2 Genotype versus gene expression ... 4
1.1.3 Population- versus family-based investigations ... 6
1.1.4 Association versus population genetics ... 7
1.2 Data components and terminology ... 7
1.2.1 Genetic information ... 8
1.2.2 Traits ... 11
1.2.3 Covariates ... 12
1.3 Data examples ... 12
1.3.1 Complex disease association studies ... 13
1.3.2 HIV genotype association studies ... 16
1.3.3 Publicly available data used throughout the text ... 18
Problems ... 27
2 Elementary Statistical Principles ... 29
2.1 Background ... 30
2.1.1 Notation and basic probability concepts ... 30
2.1.2 Important epidemiological concepts ... 33
2.2 Measures and tests of association ... 37
2.2.1 Contingency table analysis for a binary trait ... 38
2.2.2 M-sample tests for a quantitative trait ... 44
2.2.3 Generalized linear model ... 48
2.3 Analytic challenges ... 55
2.3.1 Multiplicity and high dimensionality ... 55
2.3.2 Missing and unobservable data considerations ... 58
2.3.3 Race and ethnicity ... 60
2.3.4 Genetic models and models of association ... 61
Problems ... 62
3 Genetic Data Concepts and Tests ... 65
3.1 Linkage disequilibrium (LD) ... 65
3.1.1 Measures of LD: D' and r² ... 66
3.1.2 LD blocks and SNP tagging ... 74
3.1.3 LD and population stratification ... 76
3.2 Hardy-Weinberg equilibrium (HWE) ... 78
3.2.1 Pearson's χ²-test and Fisher's exact test ... 78
3.2.2 HWE and population substructure ... 82
3.3 Quality control and preprocessing ... 86
3.3.1 SNP chips ... 86
3.3.2 Genotyping errors ... 88
3.3.3 Identifying population substructure ... 89
3.3.4 Relatedness ... 92
3.3.5 Accounting for unobservable substructure ... 94
Problems ... 95
4 Multiple Comparison Procedures ... 97
4.1 Measures of error ... 97
4.1.1 Family-wise error rate ... 98
4.1.2 False discovery rate ... 100
4.2 Single-step and step-down adjustments ... 101
4.2.1 Bonferroni adjustment ... 102
4.2.2 Tukey and Scheffé tests ... 105
4.2.3 False discovery rate control ... 109
4.2.4 The q-value ... 112
4.3 Resampling-based methods ... 114
4.3.1 Free step-down resampling ... 114
4.3.2 Null unrestricted bootstrap ... 120
4.4 Alternative paradigms ... 123
4.4.1 Effective number of tests ... 123
4.4.2 Global tests ... 125
Problems ... 127
5 Methods for Unobservable Phase ... 129
5.1 Haplotype estimation ... 130
5.1.1 An expectation-maximization algorithm ... 130
5.1.2 Bayesian haplotype reconstruction ... 137
5.2 Estimating and testing for haplotype-trait association ... 140
5.2.1 Two-stage approaches ... 140
5.2.2 A fully likelihood-based approach ... 145
Problems ... 149
Supplemental notes ... 150
Supplemental R scripts ... 155
6 Classification and Regression Trees ... 157
6.1 Building a tree ... 157
6.1.1 Recursive partitioning ... 157
6.1.2 Splitting rules ... 158
6.1.3 Defining inputs ... 167
6.2 Optimal trees ... 173
6.2.1 Honest estimates ... 174
6.2.2 Cost-complexity pruning ... 174
Problems ... 179
7 Additional Topics in High-Dimensional Data Analysis ... 181
7.1 Random forests ... 182
7.1.1 Variable importance ... 183
7.1.2 Missing data methods ... 187
7.1.3 Covariates ... 198
7.2 Logic regression ... 198
7.3 Multivariate adaptive regression splines ... 205
7.4 Bayesian variable selection ... 209
7.5 Further readings ... 211
Problems ... 212
Appendix R Basics ... 213
A.1 Getting started ... 213
A.2 Types of data objects ... 216
A.3 Importing data ... 220
A.4 Managing data ... 221
A.5 Installing packages ... 224
A.6 Additional help ... 225
References ... 227
Glossary of Terms ... 237
Glossary of Select R Packages ... 243
Subject Index ... 247
Index of R Functions and Packages ... 251
List of Tables
1.1 Sample of FAMuSS data ... 19
1.2 Sample of HGDP data ... 24
1.3 Sample Virco data ... 26
2.1 2×3 contingency table for genotype-disease association ... 38
2.2 2×2 contingency table for genotype-disease association ... 39
3.1 Expected allele distributions under independence ... 67
3.2 Observed allele distributions under LD ... 67
3.3 Genotype counts for two biallelic loci ... 68
3.4 Haplotype distribution assuming linkage equilibrium and varying allele frequencies ... 76
3.5 Apparent LD in the presence of population stratification ... 77
3.6 Genotype counts for two homologous chromosomes ... 79
3.7 Example of the effect of population admixture on HWE ... 83
3.8 Genotype distributions for varying allele frequencies ... 84
3.9 HWD in the presence of population stratification ... 85
4.1 Type-1 and type-2 errors in hypothesis testing ... 98
4.2 Errors for multiple hypothesis tests ... 99
6.1 Sample case-control data by genotype indicators ... 161
List of Figures
1.1 Marker SNPs ... 3
1.2 Haplotype pairs corresponding to heterozygosity at two SNP loci ... 10
1.3 Meiosis and recombination ... 15
1.4 HIV life cycle ... 17
2.1 Confounding ... 34
2.2 Effect mediation ... 36
2.3 Effect modification and conditional association ... 37
2.4 Possible haplotype pairs corresponding to two SNPs ... 59
3.1 Map of pairwise LD ... 71
3.2 Illustration of LD blocks and associated tag SNPs ... 75
3.3 Application of MDS for identifying population substructure ... 92
3.4 Application of PCA for identifying population substructure ... 93
6.1 Tree structure ... 159
6.2 Classification tree for Example 6.2 ... 164
6.3 Cost-complexity pruning for Example 6.5 ... 178
7.1 Ordered variable importance scores from random forest ... 186
7.2 Example boolean statement in logic regression ... 199
7.3 Single logic regression tree from Example 7.5 ... 201
7.4 Sum of logic regression trees from Example 7.5 ... 202
7.5 Monte Carlo logic regression results from Example 7.6 ... 204
|
any_adam_object | 1 |
author | Foulkes, Andrea S. |
author_facet | Foulkes, Andrea S. |
author_role | aut |
author_sort | Foulkes, Andrea S. |
author_variant | a s f as asf |
building | Verbundindex |
bvnumber | BV035556557 |
callnumber-first | Q - Science |
callnumber-label | QH438 |
callnumber-raw | QH438.4.S73 |
callnumber-search | QH438.4.S73 |
callnumber-sort | QH 3438.4 S73 |
callnumber-subject | QH - Natural History and Biology |
classification_rvk | ST 250 WC 7000 |
classification_tum | BIO 107f MAT 620f BIO 180f DAT 307f |
ctrlnum | (OCoLC)461253485 (DE-599)DNB992146372 |
dewey-full | 576.50727 |
dewey-hundreds | 500 - Natural sciences and mathematics |
dewey-ones | 576 - Genetics and evolution |
dewey-raw | 576.50727 |
dewey-search | 576.50727 |
dewey-sort | 3576.50727 |
dewey-tens | 570 - Biology |
discipline | Biologie Informatik Mathematik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>02242nam a2200601 c 4500</leader><controlfield tag="001">BV035556557</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20191002 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">090609s2009 d||| |||| 00||| eng d</controlfield><datafield tag="015" ind1=" " ind2=" "><subfield code="a">09,N06,0751</subfield><subfield code="2">dnb</subfield></datafield><datafield tag="016" ind1="7" ind2=" "><subfield code="a">992146372</subfield><subfield code="2">DE-101</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9780387895536</subfield><subfield code="c">PB. : EUR 48.10 (freier Pr.), sfr 75.00 (freier Pr.)</subfield><subfield code="9">978-0-387-89553-6</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9780387895543</subfield><subfield code="9">978-0-387-89554-3</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)461253485</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)DNB992146372</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakddb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-355</subfield><subfield code="a">DE-20</subfield><subfield code="a">DE-91G</subfield><subfield code="a">DE-29</subfield><subfield code="a">DE-M49</subfield><subfield code="a">DE-19</subfield><subfield code="a">DE-91S</subfield></datafield><datafield tag="050" ind1=" " ind2="0"><subfield code="a">QH438.4.S73</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">576.50727</subfield><subfield code="2">22</subfield></datafield><datafield tag="084" 
ind1=" " ind2=" "><subfield code="a">ST 250</subfield><subfield code="0">(DE-625)143626:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">WC 7000</subfield><subfield code="0">(DE-625)148142:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">BIO 107f</subfield><subfield code="2">stub</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">570</subfield><subfield code="2">sdnb</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">MAT 620f</subfield><subfield code="2">stub</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">BIO 180f</subfield><subfield code="2">stub</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">DAT 307f</subfield><subfield code="2">stub</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Foulkes, Andrea S.</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Applied statistical genetics with R</subfield><subfield code="b">for population-based association studies</subfield><subfield code="c">Andrea S. Foulkes</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Dordrecht [u.a.]</subfield><subfield code="b">Springer</subfield><subfield code="c">2009</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XXIII, 252 S.</subfield><subfield code="b">graph. 
Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">Use R!</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Automatic Data Processing</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Epidemiologic Methods</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Genetics, Population</subfield><subfield code="x">methods</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Models, Statistical</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Programming Languages</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Statistik</subfield><subfield code="0">(DE-588)4056995-0</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Biostatistik</subfield><subfield code="0">(DE-588)4729990-3</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Genetik</subfield><subfield code="0">(DE-588)4071711-2</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">R</subfield><subfield code="g">Programm</subfield><subfield code="0">(DE-588)4705956-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Biostatistik</subfield><subfield 
code="0">(DE-588)4729990-3</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="689" ind1="1" ind2="0"><subfield code="a">Genetik</subfield><subfield code="0">(DE-588)4071711-2</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="1" ind2="1"><subfield code="a">Statistik</subfield><subfield code="0">(DE-588)4056995-0</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="1" ind2="2"><subfield code="a">R</subfield><subfield code="g">Programm</subfield><subfield code="0">(DE-588)4705956-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="1" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Regensburg</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=017612360&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-017612360</subfield></datafield></record></collection> |
id | DE-604.BV035556557 |
illustrated | Illustrated |
indexdate | 2024-07-09T21:40:21Z |
institution | BVB |
isbn | 9780387895536 9780387895543 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-017612360 |
oclc_num | 461253485 |
open_access_boolean | |
owner | DE-355 DE-BY-UBR DE-20 DE-91G DE-BY-TUM DE-29 DE-M49 DE-BY-TUM DE-19 DE-BY-UBM DE-91S DE-BY-TUM |
owner_facet | DE-355 DE-BY-UBR DE-20 DE-91G DE-BY-TUM DE-29 DE-M49 DE-BY-TUM DE-19 DE-BY-UBM DE-91S DE-BY-TUM |
physical | XXIII, 252 S. graph. Darst. |
publishDate | 2009 |
publishDateSearch | 2009 |
publishDateSort | 2009 |
publisher | Springer |
record_format | marc |
series2 | Use R! |
spelling | Foulkes, Andrea S. Verfasser aut Applied statistical genetics with R for population-based association studies Andrea S. Foulkes Dordrecht [u.a.] Springer 2009 XXIII, 252 S. graph. Darst. txt rdacontent n rdamedia nc rdacarrier Use R! Automatic Data Processing Epidemiologic Methods Genetics, Population methods Models, Statistical Programming Languages Statistik (DE-588)4056995-0 gnd rswk-swf Biostatistik (DE-588)4729990-3 gnd rswk-swf Genetik (DE-588)4071711-2 gnd rswk-swf R Programm (DE-588)4705956-4 gnd rswk-swf Biostatistik (DE-588)4729990-3 s DE-604 Genetik (DE-588)4071711-2 s Statistik (DE-588)4056995-0 s R Programm (DE-588)4705956-4 s Digitalisierung UB Regensburg application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=017612360&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Foulkes, Andrea S. Applied statistical genetics with R for population-based association studies Automatic Data Processing Epidemiologic Methods Genetics, Population methods Models, Statistical Programming Languages Statistik (DE-588)4056995-0 gnd Biostatistik (DE-588)4729990-3 gnd Genetik (DE-588)4071711-2 gnd R Programm (DE-588)4705956-4 gnd |
subject_GND | (DE-588)4056995-0 (DE-588)4729990-3 (DE-588)4071711-2 (DE-588)4705956-4 |
title | Applied statistical genetics with R for population-based association studies |
title_auth | Applied statistical genetics with R for population-based association studies |
title_exact_search | Applied statistical genetics with R for population-based association studies |
title_full | Applied statistical genetics with R for population-based association studies Andrea S. Foulkes |
title_fullStr | Applied statistical genetics with R for population-based association studies Andrea S. Foulkes |
title_full_unstemmed | Applied statistical genetics with R for population-based association studies Andrea S. Foulkes |
title_short | Applied statistical genetics with R |
title_sort | applied statistical genetics with r for population based association studies |
title_sub | for population-based association studies |
topic | Automatic Data Processing Epidemiologic Methods Genetics, Population methods Models, Statistical Programming Languages Statistik (DE-588)4056995-0 gnd Biostatistik (DE-588)4729990-3 gnd Genetik (DE-588)4071711-2 gnd R Programm (DE-588)4705956-4 gnd |
topic_facet | Automatic Data Processing Epidemiologic Methods Genetics, Population methods Models, Statistical Programming Languages Statistik Biostatistik Genetik R Programm |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=017612360&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT foulkesandreas appliedstatisticalgeneticswithrforpopulationbasedassociationstudies |