Indices and Applications in High-Throughput Sequencing:
Main Author: | |
---|---|
Format: | Thesis Book |
Language: | English |
Published: | 2013 |
Subjects: | |
Online Access: | Full text Table of contents |
Description: | Alternative title: Indizes und Anwendungen in der Hochdurchsatz-Sequenzierung |
Description: | XII, 196 pp., illustrations, diagrams |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV041086973 | ||
003 | DE-604 | ||
005 | 20150408 | ||
007 | t | ||
008 | 130613s2013 gw ad|| m||| 00||| eng d | ||
035 | |a (OCoLC)863656176 | ||
035 | |a (DE-599)BVBBV041086973 | ||
040 | |a DE-604 |b ger |e rakwb | ||
041 | 0 | |a eng | |
044 | |a gw |c DE | ||
049 | |a DE-188 | ||
082 | 0 | |a 572.86330285 |2 22/ger | |
084 | |a 004 |2 FUB | ||
100 | 1 | |a Weese, David |d 1979- |e Verfasser |0 (DE-588)1035797615 |4 aut | |
245 | 1 | 0 | |a Indices and Applications in High-Throughput Sequencing |c vorgelegt von David Weese |
246 | 1 | 3 | |a Indizes und Anwendungen in der Hochdurchsatz-Sequenzierung |
264 | 1 | |c 2013 | |
300 | |a XII, 196 S. |b Ill., graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
500 | |a Nebentitel: Indizes und Anwendungen in der Hochdurchsatz-Sequenzierung | ||
502 | |a Berlin, Freie Univ., Diss., 2013 | ||
650 | 0 | 7 | |a Indizierung |g Informatik |0 (DE-588)4385466-7 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a High throughput screening |0 (DE-588)4596131-1 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Volltext |0 (DE-588)4740819-4 |2 gnd |9 rswk-swf |
655 | 7 | |0 (DE-588)4113937-9 |a Hochschulschrift |2 gnd-content | |
689 | 0 | 0 | |a Volltext |0 (DE-588)4740819-4 |D s |
689 | 0 | 1 | |a Indizierung |g Informatik |0 (DE-588)4385466-7 |D s |
689 | 0 | 2 | |a High throughput screening |0 (DE-588)4596131-1 |D s |
689 | 0 | |5 DE-604 | |
776 | 0 | 8 | |i Erscheint auch als |n Online-Ausgabe |o urn:nbn:de:kobv:188-fudissthesis000000094456-3 |
856 | 4 | 1 | |u http://www.diss.fu-berlin.de/diss/receive/FUDISS_thesis_000000094456 |z kostenfrei |3 Volltext |
856 | 4 | 2 | |m DNB Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026063646&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
912 | |a ebook | ||
999 | |a oai:aleph.bib-bvb.de:BVB01-026063646 |
Record in the search index
_version_ | 1804150461217898496 |
---|---|
adam_text | CONTENTS
PART I INTRODUCTION 1
1. INTRODUCTION 3
1.1 PREFACE 3
1.2 SANGER SEQUENCING 4
1.3 HIGH-THROUGHPUT SEQUENCING TECHNOLOGIES 5
1.4 APPLICATIONS OF HIGH-THROUGHPUT SEQUENCING 6
1.5 OVERVIEW 7
1.5.1 INDEX DATA STRUCTURES 8
1.5.2 READ MAPPING 9
1.5.3 FREQUENCY STRING MINING 10
2. MATHEMATICAL PRELIMINARIES 13
2.1 NOTATIONS 13
2.2 RELATIONS 14
2.3 SUFFIX TREE 15
2.4 TRANSCRIPTS AND ALIGNMENTS 17
2.5 APPROXIMATE MATCHING 19
PART II INDEX DATA STRUCTURES 23
3. ENHANCED SUFFIX ARRAY 25
3.1 DEFINITIONS 25
3.1.1 SUFFIX ARRAY 25
3.1.2 LCP TABLE 25
3.1.3 CHILD TABLE 26
3.2 REPRESENTATION 29
3.3 CONSTRUCTION OF THE SUFFIX ARRAY 30
3.3.1 THE LINEAR-TIME ALGORITHM BY KÄRKKÄINEN ET AL. 31
3.3.2 DIFFERENCE COVERS 33
3.3.3 OUR ALGORITHMS 35
3.3.4 EXTERNAL MEMORY VARIANT 39
3.3.5 EXTENSION TO MULTIPLE SEQUENCES 40
3.4 CONSTRUCTION OF THE LCP TABLE 42
3.4.1 THE LINEAR-TIME ALGORITHM BY KASAI ET AL. 43
3.4.2 SPACE-SAVING VARIANT 44
3.4.3 ADAPTATION TO EXTERNAL MEMORY 45
3.4.4 EXTENSION TO MULTIPLE SEQUENCES 46
HTTP://D-NB.INFO/1042805709
3.5 CONSTRUCTION OF THE CHILD TABLE 47
3.5.1 BOTTOM-UP SUFFIX TREE TRAVERSAL 47
3.5.2 THE LINEAR-TIME ALGORITHM BY ABOUELHODA ET AL. 47
3.5.3 ADAPTATION TO EXTERNAL MEMORY AND MULTIPLE SEQUENCES 49
3.6 APPLICATIONS 51
3.6.1 SEARCHING THE SUFFIX ARRAY 51
3.6.2 TRAVERSING THE SUFFIX TREE 53
3.6.3 ACCESSING THE SUFFIX TREE 57
3.6.4 REPEAT SEARCH 58
4. LAZY SUFFIX TREE 65
4.1 THE WOTD ALGORITHM 65
4.2 LAZY CONSTRUCTION AND REPRESENTATION 66
4.2.1 THE ORIGINAL DATA STRUCTURE 68
4.2.2 OUR DATA STRUCTURE 70
4.2.3 EXTENSION TO MULTIPLE SEQUENCES 73
4.3 APPLICATIONS 73
4.3.1 TRAVERSING AND ACCESSING THE LAZY SUFFIX TREE 74
4.3.2 RADIX TREES 76
4.3.3 MULTIPLE EXACT PATTERN SEARCH 76
4.3.4 APPROXIMATE PATTERN SEARCH 78
5. Q-GRAM INDEX 83
5.1 DEFINITIONS 83
5.2 THE DIRECT ADDRESSING Q-GRAM INDEX 84
5.3 CONSTRUCTION 84
5.3.1 COUNTING SORT ALGORITHM 85
5.3.2 EXTENSION TO MULTIPLE SEQUENCES 85
5.3.3 ADAPTATION TO EXTERNAL MEMORY 86
5.4 THE OPEN ADDRESSING Q-GRAM INDEX 86
5.5 APPLICATIONS 88
5.5.1 Q-GRAM COUNTING FILTERS FOR APPROXIMATE MATCHING 90
PART III APPLICATIONS 95
6. READ MAPPING 97
6.1 RELATED WORK 97
6.2 THE RAZERS ALGORITHM 100
6.3 DEFINITIONS 101
6.4 FILTRATION 102
6.4.1 SWIFT FILTER 102
6.4.2 PIGEONHOLE FILTER 103
6.5 LOSSY FILTRATION AND PREDICTION OF SENSITIVITY 104
6.5.1 SENSITIVITY CALCULATION OF Q-GRAM COUNTING FILTERS 105
6.5.2 SENSITIVITY CALCULATION OF PIGEONHOLE FILTERS 109
6.5.3 CHOOSING FILTRATION PARAMETERS 110
6.6 VERIFICATION 111
6.6.1 HAMMING DISTANCE VERIFICATION 111
6.6.2 EDIT DISTANCE VERIFICATION 112
6.7 PAIRED-END MAPPING 117
6.8 MATCH PROCESSING 118
6.9 PARALLELIZATION 119
6.10 EXPERIMENTAL RESULTS 119
6.10.1 COMPARING THE SWIFT AND PIGEONHOLE FILTERS 120
6.10.2 ANALYZING THE SENSITIVITY ESTIMATION ACCURACY 121
6.10.3 ACHIEVED SPEEDUP 125
6.10.4 RABEMA BENCHMARK RESULTS 125
6.10.5 VARIANT DETECTION RESULTS 126
6.10.6 PERFORMANCE COMPARISON 127
7. FREQUENCY STRING MINING 131
7.1 RELATED WORK 131
7.2 DEFINITIONS 132
7.2.1 PREDICATES 133
7.2.2 MONOTONICITY 135
7.2.3 CONJUNCTIVE PREDICATES 136
7.3 MONOTONIC HULL 136
7.4 THE LINEAR-TIME ALGORITHM BY FISCHER ET AL. 137
7.4.1 THE ORIGINAL ALGORITHM 137
7.4.2 SPACE EFFICIENT VARIANTS 138
7.5 A FAST ALGORITHM BASED ON LAZY SUFFIX TREES 139
7.5.1 THE DEFERRED FREQUENCY INDEX 139
7.5.2 ALGORITHMIC DETAILS 141
7.6 EXPERIMENTAL RESULTS 142
7.6.1 TWO DATABASES 144
7.6.2 MULTIPLE DATABASES 145
7.6.3 DETECTION OF SPECIES SPECIFIC PROTEIN DOMAINS 145
8. CONCLUSION AND FUTURE WORK 149
A. APPENDIX 153
A.1 HIGH-THROUGHPUT SEQUENCING TECHNOLOGIES IN DETAIL 153
A.2 PROVING SENSITIVITY RECURSIONS 156
A.3 READ MAPPER PARAMETRIZATION 157
A.4 EXTENDED VARIATION DETECTION TABLES 158
A.5 EXTENDED PERFORMANCE COMPARISON TABLES 159
A.6 PROVING HULL OPTIMALITY 163
B. CURRICULUM VITAE 165
C. DECLARATION 169
BIBLIOGRAPHY 170
INDEX 195
|
any_adam_object | 1 |
author | Weese, David 1979- |
author_GND | (DE-588)1035797615 |
author_facet | Weese, David 1979- |
author_role | aut |
author_sort | Weese, David 1979- |
author_variant | d w dw |
building | Verbundindex |
bvnumber | BV041086973 |
collection | ebook |
ctrlnum | (OCoLC)863656176 (DE-599)BVBBV041086973 |
dewey-full | 572.86330285 |
dewey-hundreds | 500 - Natural sciences and mathematics |
dewey-ones | 572 - Biochemistry |
dewey-raw | 572.86330285 |
dewey-search | 572.86330285 |
dewey-sort | 3572.86330285 |
dewey-tens | 570 - Biology |
discipline | Biologie |
format | Thesis Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01938nam a2200457 c 4500</leader><controlfield tag="001">BV041086973</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20150408 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">130613s2013 gw ad|| m||| 00||| eng d</controlfield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)863656176</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV041086973</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="044" ind1=" " ind2=" "><subfield code="a">gw</subfield><subfield code="c">DE</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-188</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">572.86330285</subfield><subfield code="2">22/ger</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">004</subfield><subfield code="2">FUB</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Weese, David</subfield><subfield code="d">1979-</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1035797615</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Indices and Applications in High-Throughput Sequencing</subfield><subfield code="c">vorgelegt von David Weese</subfield></datafield><datafield tag="246" ind1="1" ind2="3"><subfield code="a">Indizes und Anwendungen in der Hochdurchsatz-Sequenzierung</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2013</subfield></datafield><datafield tag="300" ind1=" " ind2=" 
"><subfield code="a">XII, 196 S.</subfield><subfield code="b">Ill., graph. Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Nebentitel: Indizes und Anwendungen in der Hochdurchsatz-Sequenzierung</subfield></datafield><datafield tag="502" ind1=" " ind2=" "><subfield code="a">Berlin, Freie Univ., Diss., 2013</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Indizierung</subfield><subfield code="g">Informatik</subfield><subfield code="0">(DE-588)4385466-7</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">High throughput screening</subfield><subfield code="0">(DE-588)4596131-1</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Volltext</subfield><subfield code="0">(DE-588)4740819-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="655" ind1=" " ind2="7"><subfield code="0">(DE-588)4113937-9</subfield><subfield code="a">Hochschulschrift</subfield><subfield code="2">gnd-content</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Volltext</subfield><subfield code="0">(DE-588)4740819-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Indizierung</subfield><subfield code="g">Informatik</subfield><subfield code="0">(DE-588)4385466-7</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" 
ind1="0" ind2="2"><subfield code="a">High throughput screening</subfield><subfield code="0">(DE-588)4596131-1</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Online-Ausgabe</subfield><subfield code="o">urn:nbn:de:kobv:188-fudissthesis000000094456-3</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">http://www.diss.fu-berlin.de/diss/receive/FUDISS_thesis_000000094456</subfield><subfield code="z">kostenfrei</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">DNB Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026063646&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ebook</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-026063646</subfield></datafield></record></collection> |
genre | (DE-588)4113937-9 Hochschulschrift gnd-content |
genre_facet | Hochschulschrift |
id | DE-604.BV041086973 |
illustrated | Illustrated |
indexdate | 2024-07-10T00:39:17Z |
institution | BVB |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-026063646 |
oclc_num | 863656176 |
open_access_boolean | 1 |
owner | DE-188 |
owner_facet | DE-188 |
physical | XII, 196 S. Ill., graph. Darst. |
psigel | ebook |
publishDate | 2013 |
publishDateSearch | 2013 |
publishDateSort | 2013 |
record_format | marc |
spelling | Weese, David 1979- Verfasser (DE-588)1035797615 aut Indices and Applications in High-Throughput Sequencing vorgelegt von David Weese Indizes und Anwendungen in der Hochdurchsatz-Sequenzierung 2013 XII, 196 S. Ill., graph. Darst. txt rdacontent n rdamedia nc rdacarrier Nebentitel: Indizes und Anwendungen in der Hochdurchsatz-Sequenzierung Berlin, Freie Univ., Diss., 2013 Indizierung Informatik (DE-588)4385466-7 gnd rswk-swf High throughput screening (DE-588)4596131-1 gnd rswk-swf Volltext (DE-588)4740819-4 gnd rswk-swf (DE-588)4113937-9 Hochschulschrift gnd-content Volltext (DE-588)4740819-4 s Indizierung Informatik (DE-588)4385466-7 s High throughput screening (DE-588)4596131-1 s DE-604 Erscheint auch als Online-Ausgabe urn:nbn:de:kobv:188-fudissthesis000000094456-3 http://www.diss.fu-berlin.de/diss/receive/FUDISS_thesis_000000094456 kostenfrei Volltext DNB Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026063646&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Weese, David 1979- Indices and Applications in High-Throughput Sequencing Indizierung Informatik (DE-588)4385466-7 gnd High throughput screening (DE-588)4596131-1 gnd Volltext (DE-588)4740819-4 gnd |
subject_GND | (DE-588)4385466-7 (DE-588)4596131-1 (DE-588)4740819-4 (DE-588)4113937-9 |
title | Indices and Applications in High-Throughput Sequencing |
title_alt | Indizes und Anwendungen in der Hochdurchsatz-Sequenzierung |
title_auth | Indices and Applications in High-Throughput Sequencing |
title_exact_search | Indices and Applications in High-Throughput Sequencing |
title_full | Indices and Applications in High-Throughput Sequencing vorgelegt von David Weese |
title_fullStr | Indices and Applications in High-Throughput Sequencing vorgelegt von David Weese |
title_full_unstemmed | Indices and Applications in High-Throughput Sequencing vorgelegt von David Weese |
title_short | Indices and Applications in High-Throughput Sequencing |
title_sort | indices and applications in high throughput sequencing |
topic | Indizierung Informatik (DE-588)4385466-7 gnd High throughput screening (DE-588)4596131-1 gnd Volltext (DE-588)4740819-4 gnd |
topic_facet | Indizierung Informatik High throughput screening Volltext Hochschulschrift |
url | http://www.diss.fu-berlin.de/diss/receive/FUDISS_thesis_000000094456 http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026063646&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT weesedavid indicesandapplicationsinhighthroughputsequencing AT weesedavid indizesundanwendungeninderhochdurchsatzsequenzierung |