Algorithms in computational molecular biology: techniques, approaches and applications
Format: | Book |
---|---|
Language: | English |
Published: | Hoboken, NJ: Wiley, 2011 |
Series: | Wiley series on bioinformatics: Computational techniques and engineering |
Subjects: | |
Online Access: | Table of contents |
Item Description: | Includes bibliographical references |
Physical Description: | XXXVII, 1044 p., ill., graphs |
ISBN: | 9780470505199 |
Staff View
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV037277083 | ||
003 | DE-604 | ||
005 | 20181130 | ||
007 | t | ||
008 | 110314s2011 ad|| |||| 00||| eng d | ||
020 | |a 9780470505199 |9 978-0-470-50519-9 | ||
035 | |a (OCoLC)732300785 | ||
035 | |a (DE-599)BVBBV037277083 | ||
040 | |a DE-604 |b ger | ||
041 | 0 | |a eng | |
049 | |a DE-11 |a DE-M49 |a DE-19 | ||
082 | 0 | |a 572.80285 | |
084 | |a WC 7700 |0 (DE-625)148144: |2 rvk | ||
084 | |a BIO 110f |2 stub | ||
084 | |a BIO 220f |2 stub | ||
084 | |a BIO 105f |2 stub | ||
245 | 1 | 0 | |a Algorithms in computational molecular biology |b techniques, approaches and applications |c ed. by Mourad Elloumi ... |
264 | 1 | |a Hoboken, NJ |b Wiley |c 2011 | |
300 | |a XXXVII, 1044 S. |b Ill., graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 0 | |a Wiley series on bioinformatics: Computational techniques and engineering | |
500 | |a Literaturangaben | ||
650 | 0 | 7 | |a Molekularbiologie |0 (DE-588)4039983-7 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Algorithmus |0 (DE-588)4001183-5 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Bioinformatik |0 (DE-588)4611085-9 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Biomathematik |0 (DE-588)4139408-2 |2 gnd |9 rswk-swf |
655 | 7 | |0 (DE-588)4143413-4 |a Aufsatzsammlung |2 gnd-content | |
689 | 0 | 0 | |a Molekularbiologie |0 (DE-588)4039983-7 |D s |
689 | 0 | 1 | |a Biomathematik |0 (DE-588)4139408-2 |D s |
689 | 0 | 2 | |a Bioinformatik |0 (DE-588)4611085-9 |D s |
689 | 0 | 3 | |a Algorithmus |0 (DE-588)4001183-5 |D s |
689 | 0 | |C b |5 DE-604 | |
700 | 1 | |a Elloumi, Mourad |e Sonstige |4 oth | |
856 | 4 | 2 | |m HEBIS Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=021189930&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-021189930 |
Record in the Search Index
_version_ | 1804143904014991360 |
---|---|
adam_text |
ALGORITHMS IN
COMPUTATIONAL MOLECULAR BIOLOGY TECHNIQUES, APPROACHES AND APPLICATIONS
EDITED BY
MOURAD ELLOUMI
UNIT OF TECHNOLOGIES OF INFORMATION AND COMMUNICATION AND UNIVERSITY OF
TUNIS-EL MANAR, TUNISIA
ALBERT Y. ZOMAYA
THE UNIVERSITY OF SYDNEY, AUSTRALIA
WILEY
A JOHN WILEY & SONS, INC., PUBLICATION
CONTENTS
PREFACE XXXI
CONTRIBUTORS XXXIII
I STRINGS PROCESSING AND APPLICATION TO BIOLOGICAL SEQUENCES 1
1 STRING DATA STRUCTURES FOR COMPUTATIONAL MOLECULAR BIOLOGY 3
CHRISTOS MAKRIS AND EVANGELOS THEODORIDIS
1.1 INTRODUCTION / 3 1.2 MAIN STRING INDEXING DATA STRUCTURES / 6 1.2.1
SUFFIX TREES / 6 1.2.2 SUFFIX ARRAYS / 8 1.3 INDEX STRUCTURES FOR
WEIGHTED STRINGS / 12
1.4 INDEX STRUCTURES FOR INDETERMINATE STRINGS / 14 1.5 STRING DATA
STRUCTURES IN MEMORY HIERARCHIES / 17 1.6 CONCLUSIONS / 20 REFERENCES /
20
2 EFFICIENT RESTRICTED-CASE ALGORITHMS FOR PROBLEMS IN COMPUTATIONAL
BIOLOGY 27
PATRICIA A. EVANS AND H. TODD WAREHAM
2.1 THE NEED FOR SPECIAL CASES / 27 2.2 ASSESSING EFFICIENT SOLVABILITY
OPTIONS FOR GENERAL PROBLEMS AND SPECIAL CASES / 28 2.3 STRING AND
SEQUENCE PROBLEMS / 30
2.4 SHORTEST COMMON SUPERSTRING / 31 2.4.1 SOLVING THE GENERAL PROBLEM /
32 2.4.2 SPECIAL CASE: SCST FOR SHORT STRINGS OVER SMALL ALPHABETS / 34
2.4.3 DISCUSSION / 35
2.5 LONGEST COMMON SUBSEQUENCE / 36
2.5.1 SOLVING THE GENERAL PROBLEM / 37 2.5.2 SPECIAL CASE: LCS OF
SIMILAR SEQUENCES / 39 2.5.3 SPECIAL CASE: LCS UNDER SYMBOL-OCCURRENCE
RESTRICTIONS / 39 2.5.4 DISCUSSION / 40 2.6 COMMON APPROXIMATE SUBSTRING
/ 41
2.6.1 SOLVING THE GENERAL PROBLEM / 42 2.6.2 SPECIAL CASE: COMMON
APPROXIMATE STRING / 44 2.6.3 DISCUSSION / 45 2.7 CONCLUSION / 46
REFERENCES / 47
3 FINITE AUTOMATA IN PATTERN MATCHING 51
JAN HOLUB
3.1 INTRODUCTION / 51 3.1.1 PRELIMINARIES / 52 3.2 DIRECT USE OF DFA IN
STRINGOLOGY / 53 3.2.1 FORWARD AUTOMATA / 53
3.2.2 DEGENERATE STRINGS / 56 3.2.3 INDEXING AUTOMATA / 57 3.2.4
FILTERING AUTOMATA / 59 3.2.5 BACKWARD AUTOMATA / 59
3.2.6 AUTOMATA WITH FAIL FUNCTION / 60 3.3 NFA SIMULATION / 60 3.3.1
BASIC SIMULATION METHOD / 61
3.3.2 BIT PARALLELISM / 61 3.3.3 DYNAMIC PROGRAMMING / 63 3.3.4 BASIC
SIMULATION METHOD WITH DETERMINISTIC STATE CACHE / 66 3.4 FINITE
AUTOMATON AS MODEL OF COMPUTATION / 66 3.5 FINITE AUTOMATA COMPOSITION /
67 3.6 SUMMARY / 67 REFERENCES / 69
4 NEW DEVELOPMENTS IN PROCESSING OF DEGENERATE SEQUENCES 73
PAVLOS ANTONIOU AND COSTAS S. ILIOPOULOS
4.1 INTRODUCTION / 73
4.1.1 DEGENERATE PRIMER DESIGN PROBLEM / 74 4.2 BACKGROUND / 74 4.3
BASIC DEFINITIONS / 76
4.4 REPETITIVE STRUCTURES IN DEGENERATE STRINGS / 79
4.4.1 USING THE MASKING TECHNIQUE / 79 4.4.2 COMPUTING THE SMALLEST
COVER OF THE DEGENERATE STRING X / 79 4.4.3 COMPUTING MAXIMAL LOCAL
COVERS OF X / 81 4.4.4 COMPUTING ALL COVERS OF X / 84
4.4.5 COMPUTING THE SEEDS OF X / 84 4.5 CONSERVATIVE STRING COVERING IN
DEGENERATE STRINGS / 84 4.5.1 FINDING CONSTRAINED PATTERN P IN
DEGENERATE STRING T / 85 4.5.2 COMPUTING λ-CONSERVATIVE COVERS OF
DEGENERATE STRINGS / 86
4.5.3 COMPUTING λ-CONSERVATIVE SEEDS OF DEGENERATE STRINGS / 87 4.6
CONCLUSION / 88 REFERENCES / 89
5 EXACT SEARCH ALGORITHMS FOR BIOLOGICAL SEQUENCES 91
ERIC RIVALS, LEENA SALMELA, AND JORMA TARHIO
5.1 INTRODUCTION / 91 5.2 SINGLE PATTERN MATCHING ALGORITHMS / 93 5.2.1
ALGORITHMS FOR DNA SEQUENCES / 94 5.2.2 ALGORITHMS FOR AMINO ACIDS / 96
5.3 ALGORITHMS FOR MULTIPLE PATTERNS / 97 5.3.1 TRIE-BASED ALGORITHMS /
97 5.3.2 FILTERING ALGORITHMS / 100 5.3.3 OTHER ALGORITHMS / 103 5.4
APPLICATION OF EXACT SET PATTERN MATCHING FOR READ MAPPING / 103
5.4.1 MPSCAN: AN EFFICIENT EXACT SET PATTERN MATCHING TOOL FOR DNA/RNA
SEQUENCES / 103 5.4.2 OTHER SOLUTIONS FOR MAPPING READS / 104 5.4.3
COMPARISON OF MAPPING SOLUTIONS / 105 5.5 CONCLUSIONS / 107 REFERENCES /
108
6 ALGORITHMIC ASPECTS OF ARC-ANNOTATED SEQUENCES 113
GUILLAUME BLIN, MAXIME CROCHEMORE, AND STEPHANE VIALETTE
6.1 INTRODUCTION / 113 6.2 PRELIMINARIES / 114 6.2.1 ARC-ANNOTATED
SEQUENCES / 114 6.2.2 HIERARCHY / 114
6.2.3 REFINED HIERARCHY / 115
6.2.4 ALIGNMENT / 115
6.2.5 EDIT OPERATIONS / 116 6.3 LONGEST ARC-PRESERVING COMMON
SUBSEQUENCE / 117 6.3.1 DEFINITION / 117
6.3.2 CLASSICAL COMPLEXITY / 118 6.3.3 PARAMETERIZED COMPLEXITY / 119
6.3.4 APPROXIMABILITY / 120 6.4 ARC-PRESERVING SUBSEQUENCE / 120 6.4.1
DEFINITION / 120 6.4.2 CLASSICAL COMPLEXITY / 121 6.4.3 CLASSICAL
COMPLEXITY FOR THE REFINED HIERARCHY / 121
6.4.4 OPEN PROBLEMS / 122 6.5 MAXIMUM ARC-PRESERVING COMMON SUBSEQUENCE
/ 122 6.5.1 DEFINITION / 122
6.5.2 CLASSICAL COMPLEXITY / 123 6.5.3 OPEN PROBLEMS / 123 6.6 EDIT
DISTANCE / 123 6.6.1 DEFINITION / 123
6.6.2 CLASSICAL COMPLEXITY / 123 6.6.3 APPROXIMABILITY / 125 6.6.4 OPEN
PROBLEMS / 125 REFERENCES / 125
7 ALGORITHMIC ISSUES IN DNA BARCODING PROBLEMS 129
BHASKAR DASGUPTA, MING-YANG KAO, AND ION MANDOIU
7.1 INTRODUCTION / 129 7.2 TEST SET PROBLEMS: A GENERAL FRAMEWORK FOR
SEVERAL BARCODING PROBLEMS / 130 7.3 A SYNOPSIS OF BIOLOGICAL
APPLICATIONS OF BARCODING / 132
7.4 SURVEY OF ALGORITHMIC TECHNIQUES ON BARCODING / 133 7.4.1 INTEGER
PROGRAMMING / 134 7.4.2 LAGRANGIAN RELAXATION AND SIMULATED ANNEALING /
134 7.4.3 PROVABLY ASYMPTOTICALLY OPTIMAL RESULTS / 134
7.5 INFORMATION CONTENT APPROACH / 135 7.6 SET-COVERING APPROACH / 136
7.6.1 SET-COVERING IMPLEMENTATION IN MORE DETAIL / 137 7.7 EXPERIMENTAL
RESULTS AND SOFTWARE AVAILABILITY / 139
7.7.1 RANDOMLY GENERATED INSTANCES / 139 7.7.2 REAL DATA / 140 7.7.3
SOFTWARE AVAILABILITY / 140 7.8 CONCLUDING REMARKS / 140 REFERENCES /
141
8 RECENT ADVANCES IN WEIGHTED DNA SEQUENCES 143
MANOLIS CHRISTODOULAKIS AND COSTAS S. ILIOPOULOS
8.1 INTRODUCTION / 143 8.2 PRELIMINARIES / 146 8.2.1 STRINGS / 146 8.2.2
WEIGHTED SEQUENCES / 147
8.3 INDEXING / 148 8.3.1 WEIGHTED SUFFIX TREE / 148 8.3.2 PROPERTY
SUFFIX TREE / 151 8.4 PATTERN MATCHING / 152
8.4.1 PATTERN MATCHING USING THE WEIGHTED SUFFIX TREE / 152 8.4.2
PATTERN MATCHING USING MATCH COUNTS / 153 8.4.3 PATTERN MATCHING WITH
GAPS / 154 8.4.4 PATTERN MATCHING WITH SWAPS / 156 8.5 APPROXIMATE
PATTERN MATCHING / 157
8.5.1 HAMMING DISTANCE / 157 8.6 REPETITIONS, COVERS, AND TANDEM REPEATS
/ 160 8.6.1 FINDING SIMPLE REPETITIONS WITH THE WEIGHTED SUFFIX TREE /
161 8.6.2 FIXED-LENGTH SIMPLE REPETITIONS / 161
8.6.3 FIXED-LENGTH STRICT REPETITIONS / 163 8.6.4 FIXED-LENGTH TANDEM
REPEATS / 163 8.6.5 IDENTIFYING COVERS / 164 8.7 MOTIF DISCOVERY / 164
8.7.1 APPROXIMATE MOTIFS IN A SINGLE WEIGHTED SEQUENCE / 164 8.7.2
APPROXIMATE COMMON MOTIFS IN A SET OF WEIGHTED SEQUENCES / 165 8.8
CONCLUSIONS / 166 REFERENCES / 167
9 DNA COMPUTING FOR SUBGRAPH ISOMORPHISM PROBLEM AND RELATED PROBLEMS
171
SUN-YUAN HSIEH, CHAO-WEN HUANG, AND HSIN-HUNG CHOU
9.1 INTRODUCTION / 171 9.2 DEFINITIONS OF SUBGRAPH ISOMORPHISM PROBLEM
AND RELATED PROBLEMS / 172 9.3 DNA COMPUTING MODELS / 174
9.3.1 THE STICKERS / 174 9.3.2 THE ADLEMAN-LIPTON MODEL / 175 9.4 THE
STICKER-BASED SOLUTION SPACE / 175 9.4.1 USING STICKERS FOR GENERATING
THE PERMUTATION SET / 176
9.4.2 USING STICKERS FOR GENERATING THE SOLUTION SPACE / 177
9.5 ALGORITHMS FOR SOLVING PROBLEMS / 179
9.5.1 SOLVING THE SUBGRAPH ISOMORPHISM PROBLEM / 179 9.5.2 SOLVING THE
GRAPH ISOMORPHISM PROBLEM / 183 9.5.3 SOLVING THE MAXIMUM COMMON
SUBGRAPH PROBLEM / 184 9.6 EXPERIMENTAL DATA / 187 9.7 CONCLUSION / 188
REFERENCES / 188
II ANALYSIS OF BIOLOGICAL SEQUENCES 191
10 GRAPHS IN BIOINFORMATICS 193
ELSA CHACKO AND SHOBA RANGANATHAN
10.1 GRAPH THEORY-ORIGIN / 193
10.1.1 WHAT IS A GRAPH? / 193 10.1.2 TYPES OF GRAPHS / 194 10.1.3
WELL-KNOWN GRAPH PROBLEMS AND ALGORITHMS / 200 10.2 GRAPHS AND THE
BIOLOGICAL WORLD / 207 10.2.1 ALTERNATIVE SPLICING AND GRAPHS / 207
10.2.2 EVOLUTIONARY TREE CONSTRUCTION / 208
10.2.3 TRACKING THE TEMPORAL VARIATION OF BIOLOGICAL SYSTEMS / 209
10.2.4 IDENTIFYING PROTEIN DOMAINS BY CLUSTERING SEQUENCE ALIGNMENTS /
210 10.2.5 CLUSTERING GENE EXPRESSION DATA / 211 10.2.6 PROTEIN
STRUCTURAL DOMAIN DECOMPOSITION / 212 10.2.7 OPTIMAL DESIGN OF THERMALLY
STABLE PROTEINS / 212 10.2.8 THE SEQUENCING BY HYBRIDIZATION (SBH)
PROBLEM / 214 10.2.9 PREDICTING INTERACTIONS IN PROTEIN NETWORKS BY
COMPLETING DEFECTIVE CLIQUES / 215 10.3 CONCLUSION / 216 REFERENCES /
216
11 A FLEXIBLE DATA STORE FOR MANAGING BIOINFORMATICS DATA 221
BASSAM A. ALQARALLEH, CHEN WANG, BING BING ZHOU, AND ALBERT Y. ZOMAYA
11.1 INTRODUCTION / 221
11.1.1 BACKGROUND / 222 11.1.2 SCALABILITY CHALLENGES / 222 11.2 DATA
MODEL AND SYSTEM OVERVIEW / 223
11.3 REPLICATION AND LOAD BALANCING / 227
11.3.1 REPLICATING AN INDEX NODE / 228 11.3.2 ANSWERING RANGE QUERIES
WITH REPLICAS / 229 11.4 EVALUATION / 230 11.4.1 POINT QUERY PROCESSING
PERFORMANCE / 230
11.4.2 RANGE QUERY PROCESSING PERFORMANCE / 233 11.4.3 GROWTH OF THE
REPLICAS OF AN INDEXING NODE / 235 11.5 RELATED WORK / 237 11.6 SUMMARY
/ 237 REFERENCES / 238
12 ALGORITHMS FOR THE ALIGNMENT OF BIOLOGICAL SEQUENCES 241
AHMED MOKADDEM AND MOURAD ELLOUMI
12.1 INTRODUCTION / 241 12.2 ALIGNMENT ALGORITHMS / 242 12.2.1 PAIRWISE
ALIGNMENT ALGORITHMS / 242 12.2.2 MULTIPLE ALIGNMENT ALGORITHMS / 245
12.3 SCORE FUNCTIONS / 251 12.4 BENCHMARKS / 252 12.5 CONCLUSION / 255
ACKNOWLEDGMENTS / 255 REFERENCES / 255
13 ALGORITHMS FOR LOCAL STRUCTURAL ALIGNMENT AND STRUCTURAL MOTIF
IDENTIFICATION 261
SANGUTHEVAR RAJASEKARAN, VAMSI KUNDETI, AND MARTIN SCHILLER
13.1 INTRODUCTION / 261 13.2 PROBLEM DEFINITION OF LOCAL STRUCTURAL
ALIGNMENT / 262 13.3 VARIABLE-LENGTH ALIGNMENT FRAGMENT PAIR (VLAFP)
ALGORITHM / 263 13.3.1 ALIGNMENT FRAGMENT PAIRS / 263
13.3.2 FINDING THE OPTIMAL LOCAL ALIGNMENTS BASED ON THE VLAFP COST
FUNCTION / 264 13.4 STRUCTURAL ALIGNMENT BASED ON CENTER OF GRAVITY:
SACG / 266 13.4.1 DESCRIPTION OF PROTEIN STRUCTURE IN PDB FORMAT / 266
13.4.2 RELATED WORK / 267 13.4.3 CENTER-OF-GRAVITY-BASED ALGORITHM / 267
13.4.4 EXTENDING THEOREM 13.1 FOR ATOMIC COORDINATES IN PROTEIN
STRUCTURE / 269
13.4.5 BUILDING VCOST(I,J,Q) FUNCTION BASED ON CENTER OF GRAVITY / 270
13.5 SEARCHING STRUCTURAL MOTIFS / 270
13.6 USING SACG ALGORITHM FOR CLASSIFICATION OF NEW PROTEIN STRUCTURES /
273 13.7 EXPERIMENTAL RESULTS / 273 13.8 ACCURACY RESULTS / 273
13.9 CONCLUSION / 274 ACKNOWLEDGMENTS / 275
REFERENCES / 276
14 EVOLUTION OF THE CLUSTAL FAMILY OF MULTIPLE SEQUENCE ALIGNMENT
PROGRAMS 277
MOHAMED RADHOUENE ANIBA AND JULIE THOMPSON
14.1 INTRODUCTION / 277 14.2 CLUSTAL-CLUSTALV / 278 14.2.1 PAIRWISE
SIMILARITY SCORES / 279 14.2.2 GUIDE TREE / 280
14.2.3 PROGRESSIVE MULTIPLE ALIGNMENT / 282 14.2.4 AN EFFICIENT DYNAMIC
PROGRAMMING ALGORITHM / 282 14.2.5 PROFILE ALIGNMENTS / 284 14.3
CLUSTALW / 284
14.3.1 OPTIMAL PAIRWISE ALIGNMENTS / 284 14.3.2 MORE ACCURATE GUIDE TREE
/ 284 14.3.3 IMPROVED PROGRESSIVE ALIGNMENT / 285 14.4 CLUSTALX / 289
14.4.1 ALIGNMENT QUALITY ANALYSIS / 290 14.5 CLUSTALW AND CLUSTALX 2.0 /
292 14.6 DBCLUSTAL / 293 14.6.1 ANCHORED GLOBAL ALIGNMENT / 294
14.7 PERSPECTIVES / 295 REFERENCES / 296
15 FILTERS AND SEEDS APPROACHES FOR FAST HOMOLOGY SEARCHES IN LARGE
DATASETS 299
NADIA PISANTI, MATHIEU GIRAUD, AND PIERRE PETERLONGO
15.1 INTRODUCTION / 299
15.1.1 HOMOLOGIES AND LARGE DATASETS / 299 15.1.2 FILTER PREPROCESSING
OR HEURISTICS / 300 15.1.3 CONTENTS / 300 15.2 METHODS FRAMEWORK / 301
15.2.1 STRINGS AND REPEATS / 301 15.2.2 FILTERS-FUNDAMENTAL CONCEPTS /
301
15.3 LOSSLESS FILTERS / 303
15.3.1 HISTORY OF LOSSLESS FILTERS / 303 15.3.2 QUASAR AND
SWIFT-FILTERING REPEATS WITH EDIT DISTANCE / 304 15.3.3 NIMBUS-FILTERING
MULTIPLE REPEATS WITH HAMMING
DISTANCE / 305
15.3.4 TUIUIU-FILTERING MULTIPLE REPEATS WITH EDIT DISTANCE / 308 15.4
LOSSY SEED-BASED FILTERS / 309 15.4.1 SEED-BASED HEURISTICS / 310 15.4.2
ADVANCED SEEDS / 311
15.4.3 LATENCIES AND NEIGHBORHOOD INDEXING / 311 15.4.4 SEED-BASED
HEURISTICS IMPLEMENTATIONS / 313 15.5 CONCLUSION / 315 15.6
ACKNOWLEDGMENTS / 315 REFERENCES / 315
16 NOVEL COMBINATORIAL AND INFORMATION-THEORETIC ALIGNMENT-FREE
DISTANCES FOR BIOLOGICAL DATA MINING 321
CHIARA EPIFANIO, ALESSANDRA GABRIELE, RAFFAELE GIANCARLO, AND MARINELLA
SCIORTINO
16.1 INTRODUCTION / 321 16.2 INFORMATION-THEORETIC ALIGNMENT-FREE
METHODS / 323 16.2.1 FUNDAMENTAL INFORMATION MEASURES, STATISTICAL
DEPENDENCY, AND SIMILARITY OF SEQUENCES / 324
16.2.2 METHODS BASED ON RELATIVE ENTROPY AND EMPIRICAL PROBABILITY
DISTRIBUTIONS / 325 16.2.3 A METHOD BASED ON STATISTICAL DEPENDENCY, VIA
MUTUAL INFORMATION / 329 16.3 COMBINATORIAL ALIGNMENT-FREE METHODS / 331
16.3.1 THE AVERAGE COMMON SUBSTRING DISTANCE / 332 16.3.2 A METHOD BASED
ON THE EBWT TRANSFORM / 333 16.3.3 N-LOCAL DECODING / 334 16.4
ALIGNMENT-FREE COMPOSITIONAL METHODS / 336
16.4.1 THE K-STRING COMPOSITION APPROACH / 337 16.4.2 COMPLETE
COMPOSITION VECTOR / 338 16.4.3 FAST ALGORITHMS TO COMPUTE COMPOSITION
VECTORS / 339 16.5 ALIGNMENT-FREE EXACT WORD MATCHES METHODS / 340
16.5.1 D2 AND ITS DISTRIBUTIONAL REGIMES / 340 16.5.2 AN EXTENSION TO
MISMATCHES AND THE CHOICE OF THE OPTIMAL WORD SIZE / 342
16.5.3 THE TRANSFORMATION OF D2 INTO A METHOD ASSESSING THE STATISTICAL
SIGNIFICANCE OF SEQUENCE SIMILARITY / 343
16.6 DOMAINS OF BIOLOGICAL APPLICATION / 344
16.6.1 PHYLOGENY: INFORMATION THEORETIC AND COMBINATORIAL METHODS / 345
16.6.2 PHYLOGENY: COMPOSITIONAL METHODS / 346 16.6.3 CIS REGULATORY
MODULES / 347 16.6.4 DNA SEQUENCE DEPENDENCIES / 348 16.7 DATASETS AND
SOFTWARE FOR EXPERIMENTAL ALGORITHMICS / 349
16.7.1 DATASETS / 350 16.7.2 SOFTWARE / 353 16.8 CONCLUSIONS / 354
REFERENCES / 355
17 IN SILICO METHODS FOR THE ANALYSIS OF METABOLITES AND DRUG MOLECULES
361
VARUN KHANNA AND SHOBA RANGANATHAN
17.1 INTRODUCTION / 361
17.1.1 CHEMOINFORMATICS AND DRUG-LIKENESS / 361 17.2 MOLECULAR
DESCRIPTORS / 363 17.2.1 ONE-DIMENSIONAL (1-D) DESCRIPTORS / 363
17.2.2 TWO-DIMENSIONAL (2-D) DESCRIPTORS / 364 17.2.3 THREE-DIMENSIONAL
(3-D) DESCRIPTORS / 366 17.3 DATABASES / 367 17.3.1 PUBCHEM / 367
17.3.2 CHEMICAL ENTITIES OF BIOLOGICAL INTEREST (CHEBI) / 369 17.3.3
CHEMBANK / 369 17.3.4 CHEMIDPLUS / 369 17.3.5 CHEMDB / 369 17.4 METHODS
AND DATA ANALYSIS ALGORITHMS / 370
17.4.1 SIMPLE COUNT METHODS / 370 17.4.2 ENHANCED SIMPLE COUNT METHODS,
USING STRUCTURAL FEATURES / 371 17.4.3 ML METHODS / 372 17.5 CONCLUSIONS
/ 376 ACKNOWLEDGMENTS / 377 REFERENCES / 377
III MOTIF FINDING AND STRUCTURE PREDICTION 383
18 MOTIF FINDING ALGORITHMS IN BIOLOGICAL SEQUENCES 385
TAREK EL FALAH, MOURAD ELLOUMI, AND THIERRY LECROQ
18.1 INTRODUCTION / 385
18.2 PRELIMINARIES / 386
18.3 THE PLANTED (l, d)-MOTIF PROBLEM / 387 18.3.1 FORMULATION / 387
18.3.2 ALGORITHMS / 387
18.4 THE EXTENDED (l, d)-MOTIF PROBLEM / 391 18.4.1 FORMULATION / 391
18.4.2 ALGORITHMS / 391
18.5 THE EDITED MOTIF PROBLEM / 392 18.5.1 FORMULATION / 392 18.5.2
ALGORITHMS / 393 18.6 THE SIMPLE MOTIF PROBLEM / 393
18.6.1 FORMULATION / 393 18.6.2 ALGORITHMS / 394 18.7 CONCLUSION / 395
REFERENCES / 396
19 COMPUTATIONAL CHARACTERIZATION OF REGULATORY REGIONS 397
ENRIQUE BLANCO
19.1 THE GENOME REGULATORY LANDSCAPE / 397 19.2 QUALITATIVE MODELS OF
REGULATORY SIGNALS / 400 19.3 QUANTITATIVE MODELS OF REGULATORY SIGNALS
/ 401 19.4 DETECTION OF DEPENDENCIES IN SEQUENCES / 403
19.5 REPOSITORIES OF REGULATORY INFORMATION / 405 19.6 USING PREDICTIVE
MODELS TO ANNOTATE SEQUENCES / 406 19.7 COMPARATIVE GENOMICS
CHARACTERIZATION / 408 19.8 SEQUENCE COMPARISONS / 410
19.9 COMBINING MOTIFS AND ALIGNMENTS / 412 19.10 EXPERIMENTAL VALIDATION
/ 414 19.11 SUMMARY / 417 REFERENCES / 417
20 ALGORITHMIC ISSUES IN THE ANALYSIS OF CHIP-SEQ DATA 425
FEDERICO ZAMBELLI AND GIULIO PAVESI
20.1 INTRODUCTION / 425 20.2 MAPPING SEQUENCES ON THE GENOME / 429 20.3
IDENTIFYING SIGNIFICANTLY ENRICHED REGIONS / 434 20.3.1 CHIP-SEQ
APPROACHES TO THE IDENTIFICATION OF DNA
STRUCTURE MODIFICATIONS / 437 20.4 DERIVING ACTUAL TRANSCRIPTION FACTOR
BINDING SITES / 438
20.5 CONCLUSIONS / 444
REFERENCES / 444
21 APPROACHES AND METHODS FOR OPERON PREDICTION BASED ON MACHINE
LEARNING TECHNIQUES 449
YAN WANG, YOU ZHOU, CHUNGUANG ZHOU, SHUQIN WANG, WEI DU, CHEN ZHANG, AND YANCHUN LIANG
21.1 INTRODUCTION / 449 21.2 DATASETS, FEATURES, AND PREPROCESSES FOR
OPERON PREDICTION / 451 21.2.1 OPERON DATASETS / 451 21.2.2 FEATURES /
454
21.2.3 PREPROCESS METHODS / 459 21.3 MACHINE LEARNING PREDICTION METHODS
FOR OPERON PREDICTION / 460 21.3.1 HIDDEN MARKOV MODEL / 461
21.3.2 LINKAGE CLUSTERING / 462 21.3.3 BAYESIAN CLASSIFIER / 464 21.3.4
BAYESIAN NETWORK / 467 21.3.5 SUPPORT VECTOR MACHINE / 468 21.3.6
ARTIFICIAL NEURAL NETWORK / 470
21.3.7 GENETIC ALGORITHMS / 471 21.3.8 SEVERAL COMBINATIONS / 472 21.4
CONCLUSIONS / 474 21.5 ACKNOWLEDGMENTS / 475 REFERENCES / 475
22 PROTEIN FUNCTION PREDICTION WITH DATA-MINING TECHNIQUES 479
XING-MING ZHAO AND LUONAN CHEN
22.1 INTRODUCTION / 479 22.2 PROTEIN ANNOTATION BASED ON SEQUENCE / 480
22.2.1 PROTEIN SEQUENCE CLASSIFICATION / 480 22.2.2 PROTEIN SUBCELLULAR
LOCALIZATION PREDICTION / 483
22.3 PROTEIN ANNOTATION BASED ON PROTEIN STRUCTURE / 484 22.4 PROTEIN
FUNCTION PREDICTION BASED ON GENE-EXPRESSION DATA / 485 22.5 PROTEIN
FUNCTION PREDICTION BASED ON PROTEIN INTERACTOME MAP / 486 22.5.1
PROTEIN FUNCTION PREDICTION BASED ON LOCAL TOPOLOGY
STRUCTURE OF INTERACTION MAP / 486 22.5.2 PROTEIN FUNCTION PREDICTION
BASED ON GLOBAL TOPOLOGY OF INTERACTION MAP / 488
22.6 PROTEIN FUNCTION PREDICTION BASED ON DATA INTEGRATION / 489
22.7 CONCLUSIONS AND PERSPECTIVES / 491 REFERENCES / 493
23 PROTEIN DOMAIN BOUNDARY PREDICTION 501
PAUL D. YOO, BING BING ZHOU, AND ALBERT Y. ZOMAYA
23.1 INTRODUCTION / 501
23.2 PROFILING TECHNIQUE / 503 23.2.1 NONLOCAL INTERACTION AND VANISHING
GRADIENT PROBLEM / 506 23.2.2 HIERARCHICAL MIXTURE OF EXPERTS / 506
23.2.3 OVERALL MODULAR KERNEL ARCHITECTURE / 508 23.3 RESULTS / 510 23.4
DISCUSSION / 512
23.4.1 NONLOCAL INTERACTIONS IN AMINO ACIDS / 512 23.4.2 SECONDARY
STRUCTURE INFORMATION / 513 23.4.3 HYDROPHOBICITY AND PROFILES / 514
23.4.4 DOMAIN ASSIGNMENT IS MORE ACCURATE FOR PROTEINS WITH
FEWER DOMAINS / 514
23.5 CONCLUSIONS / 515 REFERENCES / 515
24 AN INTRODUCTION TO RNA STRUCTURE AND PSEUDOKNOT PREDICTION 521
JANA SPERSCHNEIDER AND AMITAVA DATTA
24.1 INTRODUCTION / 521 24.2 RNA SECONDARY STRUCTURE PREDICTION / 522
24.2.1 MINIMUM FREE ENERGY MODEL / 524 24.2.2 PREDICTION OF MINIMUM FREE
ENERGY STRUCTURE / 526
24.2.3 PARTITION FUNCTION CALCULATION / 530 24.2.4 BASE PAIR
PROBABILITIES / 533 24.3 RNA PSEUDOKNOTS / 534 24.3.1 BIOLOGICAL
RELEVANCE / 536
24.3.2 RNA PSEUDOKNOT PREDICTION / 537 24.3.3 DYNAMIC PROGRAMMING / 538
24.3.4 HEURISTIC APPROACHES / 541 24.3.5 PSEUDOKNOT DETECTION / 542
24.3.6 OVERVIEW / 542 24.4 CONCLUSIONS / 543 REFERENCES / 544
IV PHYLOGENY RECONSTRUCTION 547
25 PHYLOGENETIC SEARCH ALGORITHMS FOR MAXIMUM LIKELIHOOD 549
ALEXANDROS STAMATAKIS
25.1 INTRODUCTION / 549
25.1.1 PHYLOGENETIC INFERENCE / 550 25.2 COMPUTING THE LIKELIHOOD / 552
25.3 ACCELERATING THE PLF BY ALGORITHMIC MEANS / 555
25.3.1 REUSE OF VALUES ACROSS PROBABILITY VECTORS / 555 25.3.2 GAPPY
ALIGNMENTS AND POINTER MESHES / 557 25.4 ALIGNMENT SHAPES / 558 25.5
GENERAL SEARCH HEURISTICS / 559 25.5.1 LAZY EVALUATION STRATEGIES / 563
25.5.2 FURTHER HEURISTICS / 564 25.5.3 RAPID BOOTSTRAPPING / 565 25.6
COMPUTING THE ROBINSON FOULDS DISTANCE / 566 25.7 CONVERGENCE CRITERIA /
568 25.7.1 ASYMPTOTIC STOPPING / 569 25.8 FUTURE DIRECTIONS / 572
REFERENCES / 573
26 HEURISTIC METHODS FOR PHYLOGENETIC RECONSTRUCTION WITH MAXIMUM
PARSIMONY 579
ADRIEN GOEFFON, JEAN-MICHEL RICHER, AND JIN-KAO HAO
26.1 INTRODUCTION / 579 26.2 DEFINITIONS AND FORMAL BACKGROUND / 580
26.2.1 PARSIMONY AND MAXIMUM PARSIMONY / 580 26.3 METHODS / 581
26.3.1 COMBINATORIAL OPTIMIZATION / 581 26.3.2 EXACT APPROACH / 582
26.3.3 LOCAL SEARCH METHODS / 582
26.3.4 EVOLUTIONARY METAHEURISTICS AND GENETIC ALGORITHMS / 588 26.3.5
MEMETIC METHODS / 590 26.3.6 PROBLEM-SPECIFIC IMPROVEMENTS / 592 26.4
CONCLUSION / 594 REFERENCES / 595
27 MAXIMUM ENTROPY METHOD FOR COMPOSITION
VECTOR METHOD 599
RAYMOND H.-F. CHAN, ROGER W. WANG, AND JEFF C.-F. WONG
27.1 INTRODUCTION / 599 27.2 MODELS AND ENTROPY OPTIMIZATION / 601
27.2.1 DEFINITIONS / 601 27.2.2 DENOISING FORMULAS / 603
27.2.3 DISTANCE MEASURE / 611 27.2.4 PHYLOGENETIC TREE CONSTRUCTION /
613 27.3 APPLICATION AND DISCUSSION / 614 27.3.1 EXAMPLE 1 / 614
27.3.2 EXAMPLE 2 / 614 27.3.3 EXAMPLE 3 / 615 27.3.4 EXAMPLE 4 / 617
27.4 CONCLUDING REMARKS / 619 REFERENCES / 619
V MICROARRAY DATA ANALYSIS 623
28 MICROARRAY GENE EXPRESSION DATA ANALYSIS 625
ALAN WEE-CHUNG LIEW AND XIANGCHAO GAN
28.1 INTRODUCTION / 625 28.2 DNA MICROARRAY TECHNOLOGY AND EXPERIMENT /
626 28.3 IMAGE ANALYSIS AND EXPRESSION DATA EXTRACTION / 627 28.3.1
IMAGE PREPROCESSING / 628
28.3.2 BLOCK SEGMENTATION / 628 28.3.3 AUTOMATIC GRIDDING / 628 28.3.4
SPOT EXTRACTION / 628 28.4 DATA PROCESSING / 630
28.4.1 BACKGROUND CORRECTION / 630 28.4.2 NORMALIZATION / 630 28.4.3
DATA FILTERING / 631 28.5 MISSING VALUE IMPUTATION / 631 28.6 TEMPORAL
GENE EXPRESSION PROFILE ANALYSIS / 634
28.7 CYCLIC GENE EXPRESSION PROFILES DETECTION / 640 28.7.1 SSA-AR
SPECTRAL ESTIMATION / 643 28.7.2 SPECTRAL ESTIMATION BY SIGNAL
RECONSTRUCTION / 644 28.7.3 STATISTICAL HYPOTHESIS TESTING FOR PERIODIC
PROFILE
DETECTION / 646
28.8 SUMMARY / 647 ACKNOWLEDGMENTS / 648 REFERENCES / 649
29 BICLUSTERING OF MICROARRAY DATA 651
WASSIM AYADI AND MOURAD ELLOUMI
29.1 INTRODUCTION / 651 29.2 TYPES OF BICLUSTERS / 652 29.3 GROUPS OF
BICLUSTERS / 653 29.4 EVALUATION FUNCTIONS / 654 29.5 SYSTEMATIC AND
STOCHASTIC BICLUSTERING ALGORITHMS / 656 29.6 BIOLOGICAL VALIDATION /
659 29.7 CONCLUSION / 661 REFERENCES / 661
30 COMPUTATIONAL MODELS FOR CONDITION-SPECIFIC GENE AND PATHWAY
INFERENCE 665
YU-QING QIU, SHIHUA ZHANG, XIANG-SUN ZHANG, AND LUONAN CHEN
30.1 INTRODUCTION / 665 30.2 CONDITION-SPECIFIC PATHWAY IDENTIFICATION /
666 30.2.1 GENE SET ANALYSIS / 667 30.2.2 CONDITION-SPECIFIC PATHWAY
INFERENCE / 671
30.3 DISEASE GENE PRIORITIZATION AND GENETIC PATHWAY DETECTION / 681
30.4 MODULE NETWORKS / 684 30.5 SUMMARY / 685 ACKNOWLEDGMENTS / 685
REFERENCES / 685
31 HETEROGENEITY OF DIFFERENTIAL EXPRESSION IN CANCER STUDIES:
ALGORITHMS AND METHODS 691
RADHA KRISHNA MURTHY KARUTURI
31.1 INTRODUCTION / 691 31.2 NOTATIONS / 692 31.3 DIFFERENTIAL MEAN OF
EXPRESSION / 694 31.3.1 SINGLE FACTOR DIFFERENTIAL EXPRESSION / 695
31.3.2 MULTIFACTOR DIFFERENTIAL EXPRESSION / 697 31.3.3 EMPIRICAL BAYES
EXTENSION / 698 31.4 DIFFERENTIAL VARIABILITY OF EXPRESSION / 699 31.4.1
F-TEST FOR TWO-GROUP DIFFERENTIAL VARIABILITY ANALYSIS / 699
31.4.2 BARTLETT'S AND LEVENE'S TESTS FOR MULTIGROUP DIFFERENTIAL
VARIABILITY ANALYSIS / 700 31.5 DIFFERENTIAL EXPRESSION IN COMPENDIUM OF
TUMORS / 701 31.5.1 GAUSSIAN MIXTURE MODEL (GMM) FOR FINITE LEVELS OF
EXPRESSION / 701 31.5.2 OUTLIER DETECTION STRATEGY / 703 31.5.3 KURTOSIS
EXCESS / 704
31.6 DIFFERENTIAL EXPRESSION BY CHROMOSOMAL ABERRATIONS: THE LOCAL
PROPERTIES / 705 31.6.1 WAVELET VARIANCE SCANNING (WAVES) FOR
SINGLE-SAMPLE ANALYSIS / 708 31.6.2 LOCAL SINGULAR VALUE DECOMPOSITION
(LSVD) FOR
COMPENDIUM OF TUMORS / 709 31.6.3 LOCALLY ADAPTIVE STATISTICAL PROCEDURE
(LAP) FOR COMPENDIUM OF TUMORS WITH CONTROL SAMPLES / 710 31.7
DIFFERENTIAL EXPRESSION IN GENE INTERACTOME / 711
31.7.1 FRIENDLY NEIGHBORS ALGORITHM: A MULTIPLICATIVE INTERACTOME / 711
31.7.2 GENERANK: A CONTRIBUTING INTERACTOME / 712 31.7.3 TOP SCORING
PAIRS (TSP): A DIFFERENTIAL INTERACTOME / 713 31.8 DIFFERENTIAL
COEXPRESSION: GLOBAL MULTIDIMENSIONAL
INTERACTOME / 714 31.8.1 KOSTKA AND SPANG'S DIFFERENTIAL COEXPRESSION
ALGORITHM / 715 31.8.2 DIFFERENTIAL EXPRESSION LINKED DIFFERENTIAL
COEXPRESSION / 718 31.8.3 DIFFERENTIAL FRIENDLY NEIGHBORS (DIFFFNS) /
718 ACKNOWLEDGMENTS / 720 REFERENCES / 720
VI ANALYSIS OF GENOMES 723
32 COMPARATIVE GENOMICS: ALGORITHMS AND APPLICATIONS 725
XIAO YANG AND SRINIVAS ALURU
32.1 INTRODUCTION / 725
32.2 NOTATIONS / 727 32.3 ORTHOLOG ASSIGNMENT / 727 32.3.1 SEQUENCE
SIMILARITY-BASED METHOD / 729 32.3.2 PHYLOGENY-BASED METHOD / 731
32.3.3 REARRANGEMENT-BASED METHOD / 732 32.4 GENE CLUSTER AND SYNTENY
DETECTION / 734 32.4.1 SYNTENY DETECTION / 736 32.4.2 GENE CLUSTER
DETECTION / 739
32.5 CONCLUSIONS / 743 REFERENCES / 743
33 ADVANCES IN GENOME REARRANGEMENT ALGORITHMS 749
MASUD HASAN AND M. SOHEL RAHMAN
33.1 INTRODUCTION / 749 33.2 PRELIMINARIES / 752 33.3 SORTING BY
REVERSALS / 753 33.3.1 APPROACHES TO APPROXIMATION ALGORITHMS / 754
33.3.2 SIGNED PERMUTATIONS / 757 33.4 SORTING BY TRANSPOSITIONS / 759
33.4.1 APPROXIMATION RESULTS / 760
33.4.2 IMPROVED RUNNING TIME AND SIMPLER ALGORITHMS / 761 33.5 OTHER
OPERATIONS / 761 33.5.1 SORTING BY PREFIX REVERSALS / 761 33.5.2 SORTING
BY PREFIX TRANSPOSITIONS / 762
33.5.3 SORTING BY BLOCK INTERCHANGE / 762 33.5.4 SHORT SWAP AND
FIXED-LENGTH REVERSALS / 763 33.6 SORTING BY MORE THAN ONE OPERATION /
763 33.6.1 UNIFIED OPERATION: DOUBLE CUT AND JOIN / 764 33.7 FUTURE
RESEARCH DIRECTIONS / 765 33.8 NOTES ON SOFTWARE / 766 REFERENCES / 767
34 COMPUTING GENOMIC DISTANCES: AN ALGORITHMIC VIEWPOINT 773
GUILLAUME FERTIN AND IRENA RUSU
34.1 INTRODUCTION / 773
34.1.1 WHAT THIS CHAPTER IS ABOUT / 773 34.1.2 DEFINITIONS AND NOTATIONS
/ 774 34.1.3 ORGANIZATION OF THE CHAPTER / 775 34.2 INTERVAL-BASED
CRITERIA / 775 34.2.1 BRIEF INTRODUCTION / 775 34.2.2 THE CONTEXT AND
THE PROBLEMS / 776 34.2.3 COMMON INTERVALS IN PERMUTATIONS AND THE
COMMUTING
GENERATORS STRATEGY / 778 34.2.4 CONSERVED INTERVALS IN PERMUTATIONS AND
THE BOUND-AND-DROP STRATEGY / 782 34.2.5 COMMON INTERVALS IN STRINGS AND
THE ELEMENT PLOTTING
STRATEGY / 783
34.2.6 VARIANTS / 785 34.3 CHARACTER-BASED CRITERIA / 785 34.3.1
INTRODUCTION AND DEFINITION OF THE PROBLEMS / 785 34.3.2 AN
APPROXIMATION ALGORITHM FOR BAL-FMB / 787
34.3.3 AN EXACT ALGORITHM FOR UNBAL-FMB / 791
34.3.4 OTHER RESULTS AND OPEN PROBLEMS / 795 34.4 CONCLUSION / 795
REFERENCES / 796
35 WAVELET ALGORITHMS FOR DNA ANALYSIS 799
CARLO CATTANI
35.1 INTRODUCTION / 799 35.2 DNA REPRESENTATION / 802 35.2.1 PRELIMINARY
REMARKS ON DNA / 802 35.2.2 INDICATOR FUNCTION / 803
35.2.3 REPRESENTATION / 806 35.2.4 REPRESENTATION MODELS / 807 35.2.5
CONSTRAINTS ON THE REPRESENTATION IN R^2 / 808 35.2.6 COMPLEX
REPRESENTATION / 810
35.2.7 DNA WALKS / 810 35.3 STATISTICAL CORRELATIONS IN DNA / 812 35.3.1
LONG-RANGE CORRELATION / 812 35.3.2 POWER SPECTRUM / 814
35.3.3 COMPLEXITY / 817 35.4 WAVELET ANALYSIS / 818 35.4.1 HAAR WAVELET
BASIS / 819 35.4.2 HAAR SERIES / 819
35.4.3 DISCRETE HAAR WAVELET TRANSFORM / 821 35.5 HAAR WAVELET
COEFFICIENTS AND STATISTICAL PARAMETERS / 823 35.6 ALGORITHM OF THE
SHORT HAAR DISCRETE WAVELET TRANSFORM / 826
35.7 CLUSTERS OF WAVELET COEFFICIENTS / 828 35.7.1 CLUSTER ANALYSIS OF
THE WAVELET COEFFICIENTS OF THE COMPLEX DNA REPRESENTATION / 830 35.7.2
CLUSTER ANALYSIS OF THE WAVELET COEFFICIENTS OF DNA
WALKS / 834
35.8 CONCLUSION / 838 REFERENCES / 839
36 HAPLOTYPE INFERENCE MODELS AND ALGORITHMS 843
LING-YUN WU
36.1 INTRODUCTION / 843 36.2 PROBLEM STATEMENT AND NOTATIONS / 844 36.3
COMBINATORIAL METHODS / 846 36.3.1 CLARK'S INFERENCE RULE / 846
36.3.2 PURE PARSIMONY MODEL / 848
36.3.3 PHYLOGENY METHODS / 849 36.4 STATISTICAL METHODS / 851 36.4.1
MAXIMUM LIKELIHOOD METHODS / 851
36.4.2 BAYESIAN METHODS / 852 36.4.3 MARKOV CHAIN METHODS / 852 36.5
PEDIGREE METHODS / 853 36.5.1 MINIMUM RECOMBINANT HAPLOTYPE
CONFIGURATIONS / 854
36.5.2 ZERO RECOMBINANT HAPLOTYPE CONFIGURATIONS / 854 36.5.3
STATISTICAL METHODS / 855 36.6 EVALUATION / 856 36.6.1 EVALUATION
MEASUREMENTS / 856
36.6.2 COMPARISONS / 857 36.6.3 DATASETS / 857 36.7 DISCUSSION / 858
REFERENCES / 859
VII ANALYSIS OF BIOLOGICAL NETWORKS 865
37 UNTANGLING BIOLOGICAL NETWORKS USING BIOINFORMATICS 867
GAURAV KUMAR, ADRIAN P. COOTES, AND SHOBA RANGANATHAN
37.1 INTRODUCTION / 867
37.1.1 PREDICTING BIOLOGICAL PROCESSES: A MAJOR CHALLENGE TO
UNDERSTANDING BIOLOGY / 867 37.1.2 HISTORICAL PERSPECTIVE AND
MATHEMATICAL PRELIMINARIES OF NETWORKS / 868 37.1.3 STRUCTURAL
PROPERTIES OF BIOLOGICAL NETWORKS / 870 37.1.4 LOCAL TOPOLOGY OF
BIOLOGICAL NETWORKS: FUNCTIONAL
MOTIFS, MODULES, AND COMMUNITIES / 873 37.2 TYPES OF BIOLOGICAL NETWORKS
/ 878 37.2.1 PROTEIN-PROTEIN INTERACTION NETWORKS / 878 37.2.2 METABOLIC
NETWORKS / 879
37.2.3 TRANSCRIPTIONAL NETWORKS / 881 37.2.4 OTHER BIOLOGICAL NETWORKS /
883 37.3 NETWORK DYNAMIC, EVOLUTION AND DISEASE / 884 37.3.1 BIOLOGICAL
NETWORK DYNAMIC AND EVOLUTION / 884
37.3.2 BIOLOGICAL NETWORKS AND DISEASE / 886 37.4 FUTURE CHALLENGES AND
SCOPE / 887 ACKNOWLEDGMENTS / 887 REFERENCES / 888
38 PROBABILISTIC APPROACHES FOR INVESTIGATING
BIOLOGICAL NETWORKS 893
JEREMIE BOURDON AND DAMIEN EVEILLARD
38.1 PROBABILISTIC MODELS FOR BIOLOGICAL NETWORKS / 894
38.1.1 BOOLEAN NETWORKS / 895 38.1.2 PROBABILISTIC BOOLEAN NETWORKS: A
NATURAL EXTENSION / 900 38.1.3 INFERRING PROBABILISTIC MODELS FROM
EXPERIMENTS / 901 38.2 INTERPRETATION AND QUANTITATIVE ANALYSIS OF
PROBABILISTIC MODELS / 902
38.2.1 DYNAMICAL ANALYSIS AND TEMPORAL PROPERTIES / 902 38.2.2 IMPACT OF
UPDATE STRATEGIES FOR ANALYZING PROBABILISTIC BOOLEAN NETWORKS / 905
38.2.3 SIMULATIONS OF A PROBABILISTIC BOOLEAN NETWORK / 906 38.3
CONCLUSION / 911 ACKNOWLEDGMENTS / 911 REFERENCES / 911
39 MODELING AND ANALYSIS OF BIOLOGICAL NETWORKS WITH MODEL CHECKING 915
DRAGAN BOSNACKI, PETER A. J. HILBERS, RONNY S. MANS, AND ERIK P. DE VINK
39.1 INTRODUCTION / 915 39.2 PRELIMINARIES / 916 39.2.1 MODEL CHECKING /
916 39.2.2 SPIN AND PROMELA / 917
39.2.3 LTL / 918 39.3 ANALYZING GENETIC NETWORKS WITH MODEL CHECKING /
919 39.3.1 BOOLEAN REGULATORY NETWORKS / 919 39.3.2 A CASE STUDY / 919
39.3.3 TRANSLATING BOOLEAN REGULATORY GRAPHS INTO PROMELA / 921 39.3.4
SOME RESULTS / 922 39.3.5 CONCLUDING REMARKS / 924
39.3.6 RELATED WORK AND BIBLIOGRAPHIC NOTES / 924 39.4 PROBABILISTIC
MODEL CHECKING FOR BIOLOGICAL SYSTEMS / 925 39.4.1 MOTIVATION AND
BACKGROUND / 926
39.4.2 A KINETIC MODEL OF MRNA TRANSLATION / 927 39.4.3 PROBABILISTIC
MODEL CHECKING / 928 39.4.4 THE PRISM MODEL / 929 39.4.5 INSERTION
ERRORS / 933
39.4.6 CONCLUDING REMARKS / 934 39.4.7 RELATED WORK AND BIBLIOGRAPHIC
NOTES / 935 REFERENCES / 936
40 REVERSE ENGINEERING OF MOLECULAR NETWORKS
FROM A COMMON COMBINATORIAL APPROACH 941 BHASKAR DASGUPTA, PAOLA
VERA-LICONA, AND EDUARDO SONTAG
40.1 INTRODUCTION / 941 40.2 REVERSE-ENGINEERING OF BIOLOGICAL NETWORKS
/ 942 40.2.1 EVALUATION OF THE PERFORMANCE OF REVERSE-ENGINEERING
METHODS / 945 40.3 CLASSICAL COMBINATORIAL ALGORITHMS: A CASE STUDY /
946
40.3.1 BENCHMARKING RE COMBINATORIAL-BASED METHODS / 947 40.3.2 SOFTWARE
AVAILABILITY / 950 40.4 CONCLUDING REMARKS / 951 ACKNOWLEDGMENTS / 951
REFERENCES / 951
41 UNSUPERVISED LEARNING FOR GENE REGULATION NETWORK INFERENCE FROM
EXPRESSION DATA: A REVIEW 955
MOHAMED ELATI AND CELINE ROUVEIROL
41.1 INTRODUCTION / 955 41.2 GENE NETWORKS: DEFINITION AND PROPERTIES /
956 41.3 GENE EXPRESSION: DATA AND ANALYSIS / 958 41.4 NETWORK INFERENCE
AS AN UNSUPERVISED LEARNING PROBLEM / 959
41.5 CORRELATION-BASED METHODS / 959 41.6 PROBABILISTIC GRAPHICAL MODELS
/ 961 41.7 CONSTRAINT-BASED DATA MINING / 963 41.7.1 MULTIPLE USAGES OF
EXTRACTED PATTERNS / 965
41.7.2 MINING GENE REGULATION FROM TRANSCRIPTOME DATASETS / 966 41.8
VALIDATION / 969 41.8.1 STATISTICAL VALIDATION OF NETWORK INFERENCE /
970
41.8.2 BIOLOGICAL VALIDATION / 972 41.9 CONCLUSION AND PERSPECTIVES /
973 REFERENCES / 974
42 APPROACHES TO CONSTRUCTION AND ANALYSIS OF MICRORNA-MEDIATED NETWORKS
979
LIANA LICHTENSTEIN, ALBERT ZOMAYA, JENNIFER GAMBLE, AND MATHEW VADAS
42.1 INTRODUCTION / 979
42.1.1 MIRNA-MEDIATED GENETIC REGULATORY NETWORKS / 979 42.1.2 THE FOUR
LEVELS OF REGULATION IN GRNS / 981 42.1.3 OVERVIEW OF SECTIONS / 982
42.2 FUNDAMENTAL COMPONENT INTERACTION RESEARCH: PREDICTING
MIRNA GENES, REGULATORS, AND TARGETS / 982 42.2.1 PREDICTION OF NOVEL
MIRNA GENES / 983 42.2.2 PREDICTION OF MIRNA TARGETS / 984 42.2.3
PREDICTION OF MIRNA TRANSCRIPT ELEMENTS AND
TRANSCRIPTIONAL REGULATION / 984 42.3 IDENTIFYING MIRNA-MEDIATED
NETWORKS / 988 42.3.1 FORWARD ENGINEERING: CONSTRUCTION OF MULTINODE
COMPONENTS IN MIRNA-MEDIATED NETWORKS USING
PAIRED INTERACTION INFORMATION / 988 42.3.2 REVERSE
ENGINEERING: INFERENCE OF MICRORNA MODULES USING TOP-DOWN APPROACHES /
988 42.4 GLOBAL AND LOCAL ARCHITECTURE ANALYSIS IN MIRNA-CONTAINING
NETWORKS / 993 42.4.1 GLOBAL ARCHITECTURE PROPERTIES OF MIRNA-MEDIATED
POST-TRANSCRIPTIONAL NETWORKS / 993 42.4.2 LOCAL ARCHITECTURE PROPERTIES
OF MIRNA-MEDIATED
POST-TRANSCRIPTIONAL NETWORKS / 994 42.5 CONCLUSION / 1001 REFERENCES /
1001
INDEX 1007
|
any_adam_object | 1 |
building | Verbundindex |
bvnumber | BV037277083 |
classification_rvk | WC 7700 |
classification_tum | BIO 110f BIO 220f BIO 105f |
ctrlnum | (OCoLC)732300785 (DE-599)BVBBV037277083 |
dewey-full | 572.80285 |
dewey-hundreds | 500 - Natural sciences and mathematics |
dewey-ones | 572 - Biochemistry |
dewey-raw | 572.80285 |
dewey-search | 572.80285 |
dewey-sort | 3572.80285 |
dewey-tens | 570 - Biology |
discipline | Biologie Informatik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01878nam a2200469 c 4500</leader><controlfield tag="001">BV037277083</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20181130 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">110314s2011 ad|| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9780470505199</subfield><subfield code="9">978-0-470-50519-9</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)732300785</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV037277083</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-11</subfield><subfield code="a">DE-M49</subfield><subfield code="a">DE-19</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">572.80285</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">WC 7700</subfield><subfield code="0">(DE-625)148144:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">BIO 110f</subfield><subfield code="2">stub</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">BIO 220f</subfield><subfield code="2">stub</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">BIO 105f</subfield><subfield code="2">stub</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Algorithms in computational molecular biology</subfield><subfield code="b">techniques, approaches and applications</subfield><subfield code="c">ed. 
by Mourad Elloumi ...</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Hoboken, NJ</subfield><subfield code="b">Wiley</subfield><subfield code="c">2011</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XXXVII, 1044 S.</subfield><subfield code="b">Ill., graph. Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">Wiley series on bioinformatics: Computational techniques and engineering</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Literaturangaben</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Molekularbiologie</subfield><subfield code="0">(DE-588)4039983-7</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Algorithmus</subfield><subfield code="0">(DE-588)4001183-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Bioinformatik</subfield><subfield code="0">(DE-588)4611085-9</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Biomathematik</subfield><subfield code="0">(DE-588)4139408-2</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="655" ind1=" " ind2="7"><subfield code="0">(DE-588)4143413-4</subfield><subfield code="a">Aufsatzsammlung</subfield><subfield 
code="2">gnd-content</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Molekularbiologie</subfield><subfield code="0">(DE-588)4039983-7</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Biomathematik</subfield><subfield code="0">(DE-588)4139408-2</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Bioinformatik</subfield><subfield code="0">(DE-588)4611085-9</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="3"><subfield code="a">Algorithmus</subfield><subfield code="0">(DE-588)4001183-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="C">b</subfield><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Elloumi, Mourad</subfield><subfield code="e">Sonstige</subfield><subfield code="4">oth</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">HEBIS Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=021189930&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-021189930</subfield></datafield></record></collection> |
genre | (DE-588)4143413-4 Aufsatzsammlung gnd-content |
genre_facet | Aufsatzsammlung |
id | DE-604.BV037277083 |
illustrated | Illustrated |
indexdate | 2024-07-09T22:55:04Z |
institution | BVB |
isbn | 9780470505199 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-021189930 |
oclc_num | 732300785 |
open_access_boolean | |
owner | DE-11 DE-M49 DE-BY-TUM DE-19 DE-BY-UBM |
owner_facet | DE-11 DE-M49 DE-BY-TUM DE-19 DE-BY-UBM |
physical | XXXVII, 1044 S. Ill., graph. Darst. |
publishDate | 2011 |
publishDateSearch | 2011 |
publishDateSort | 2011 |
publisher | Wiley |
record_format | marc |
series2 | Wiley series on bioinformatics: Computational techniques and engineering |
spelling | Algorithms in computational molecular biology techniques, approaches and applications ed. by Mourad Elloumi ... Hoboken, NJ Wiley 2011 XXXVII, 1044 S. Ill., graph. Darst. txt rdacontent n rdamedia nc rdacarrier Wiley series on bioinformatics: Computational techniques and engineering Literaturangaben Molekularbiologie (DE-588)4039983-7 gnd rswk-swf Algorithmus (DE-588)4001183-5 gnd rswk-swf Bioinformatik (DE-588)4611085-9 gnd rswk-swf Biomathematik (DE-588)4139408-2 gnd rswk-swf (DE-588)4143413-4 Aufsatzsammlung gnd-content Molekularbiologie (DE-588)4039983-7 s Biomathematik (DE-588)4139408-2 s Bioinformatik (DE-588)4611085-9 s Algorithmus (DE-588)4001183-5 s b DE-604 Elloumi, Mourad Sonstige oth HEBIS Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=021189930&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Algorithms in computational molecular biology techniques, approaches and applications Molekularbiologie (DE-588)4039983-7 gnd Algorithmus (DE-588)4001183-5 gnd Bioinformatik (DE-588)4611085-9 gnd Biomathematik (DE-588)4139408-2 gnd |
subject_GND | (DE-588)4039983-7 (DE-588)4001183-5 (DE-588)4611085-9 (DE-588)4139408-2 (DE-588)4143413-4 |
title | Algorithms in computational molecular biology techniques, approaches and applications |
title_auth | Algorithms in computational molecular biology techniques, approaches and applications |
title_exact_search | Algorithms in computational molecular biology techniques, approaches and applications |
title_full | Algorithms in computational molecular biology techniques, approaches and applications ed. by Mourad Elloumi ... |
title_fullStr | Algorithms in computational molecular biology techniques, approaches and applications ed. by Mourad Elloumi ... |
title_full_unstemmed | Algorithms in computational molecular biology techniques, approaches and applications ed. by Mourad Elloumi ... |
title_short | Algorithms in computational molecular biology |
title_sort | algorithms in computational molecular biology techniques approaches and applications |
title_sub | techniques, approaches and applications |
topic | Molekularbiologie (DE-588)4039983-7 gnd Algorithmus (DE-588)4001183-5 gnd Bioinformatik (DE-588)4611085-9 gnd Biomathematik (DE-588)4139408-2 gnd |
topic_facet | Molekularbiologie Algorithmus Bioinformatik Biomathematik Aufsatzsammlung |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=021189930&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT elloumimourad algorithmsincomputationalmolecularbiologytechniquesapproachesandapplications |