Abstractions for performance programming on multi-core architectures with hierarchical memory
Main Author: | Terboven, Christian |
---|---|
Format: | Thesis Book |
Language: | English |
Published: | Aachen: Apprimus Verlag, 2016 |
Edition: | 1st edition |
Series: | Ergebnisse aus der Informatik, 7; Edition Wissenschaft Apprimus |
Subjects: | NUMA-Architektur; OpenMP; Parallelverarbeitung; Programmierung; Speicherhierarchie; Mehrkernprozessor; Abstraktionsebene; Gemeinsamer Speicher; Mehrprozessorsystem |
Online Access: | Table of contents |
Description: | II, 160 pages, diagrams |
ISBN: | 9783863594428 |
Internal format
MARC
LEADER | 00000nam a2200000 cb4500 | ||
---|---|---|---|
001 | BV043741491 | ||
003 | DE-604 | ||
005 | 00000000000000.0 | ||
007 | t | ||
008 | 160830s2016 |||| m||| 00||| eng d | ||
020 | |a 9783863594428 |9 978-3-86359-442-8 | ||
035 | |a (OCoLC)958176613 | ||
035 | |a (DE-599)BVBBV043741491 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-29T | ||
100 | 1 | |a Terboven, Christian |e Verfasser |0 (DE-588)1078473412 |4 aut | |
245 | 1 | 0 | |a Abstractions for performance programming on multi-core architectures with hierarchical memory |
250 | |a 1. Auflage | ||
264 | 1 | |a Aachen |b Apprimus Verlag |c 2016 | |
300 | |a II, 160 Seiten |b Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 0 | |a Ergebnisse aus der Informatik |v 7 | |
490 | 0 | |a Edition Wissenschaft Apprimus | |
502 | |b Dissertation |c RWTH Aachen University |d 2016 | ||
650 | 0 | 7 | |a NUMA-Architektur |0 (DE-588)1123829519 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a OpenMP |0 (DE-588)4648816-9 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Parallelverarbeitung |0 (DE-588)4075860-6 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Programmierung |0 (DE-588)4076370-5 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Speicherhierarchie |0 (DE-588)4256353-7 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Mehrkernprozessor |0 (DE-588)7598578-0 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Abstraktionsebene |0 (DE-588)4254804-4 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Gemeinsamer Speicher |0 (DE-588)4294156-8 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Mehrprozessorsystem |0 (DE-588)4038397-0 |2 gnd |9 rswk-swf |
655 | 7 | |0 (DE-588)4113937-9 |a Hochschulschrift |2 gnd-content | |
689 | 0 | 0 | |a Mehrkernprozessor |0 (DE-588)7598578-0 |D s |
689 | 0 | 1 | |a Speicherhierarchie |0 (DE-588)4256353-7 |D s |
689 | 0 | 2 | |a Gemeinsamer Speicher |0 (DE-588)4294156-8 |D s |
689 | 0 | 3 | |a Parallelverarbeitung |0 (DE-588)4075860-6 |D s |
689 | 0 | 4 | |a Programmierung |0 (DE-588)4076370-5 |D s |
689 | 0 | 5 | |a Abstraktionsebene |0 (DE-588)4254804-4 |D s |
689 | 0 | 6 | |a Mehrprozessorsystem |0 (DE-588)4038397-0 |D s |
689 | 0 | 7 | |a NUMA-Architektur |0 (DE-588)1123829519 |D s |
689 | 0 | 8 | |a OpenMP |0 (DE-588)4648816-9 |D s |
689 | 0 | |5 DE-604 | |
710 | 2 | |a Apprimus Verlag |e Sonstige |0 (DE-588)1068101474 |4 oth | |
856 | 4 | 2 | |m DNB Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029153207&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-029153207 |
Record in the search index
_version_ | 1804176543347376128 |
---|---|
adam_text | CONTENTS
1. INTRODUCTION 1
1.1. MOTIVATION AND GOALS 4
1.2. CONTRIBUTIONS 7
1.3. THESIS STRUCTURE 9
2. BENCHMARKS AND SYSTEM ARCHITECTURES 11
2.1. NUMA SYSTEM ARCHITECTURES 12
2.1.1. 2-SOCKET INTEL WESTMERE-EP SYSTEM 13
2.1.2. BULL BGS SYSTEM (INTEL NEHALEM-EX) 13
2.2. LINUX PAGE SIZES 15
2.3. SPECIFIC KERNELS 16
2.3.1. STREAM 18
2.3.2. SPARSE-MATRIX-VECTOR-MULTIPLICATION (SPMXV) 22
2.3.3. GENERALIZED MINIMAL RESIDUAL METHOD (GMRES) 26
2.4. BENCHMARK METHODOLOGY 27
3. SOFTWARE DESIGN FOR NUMA ARCHITECTURES 31
3.1. PERFORMANCE MODELING AND MACHINE CHARACTERISTICS 32
3.1.1. PERFORMANCE MODELING OF THE SPMXV KERNEL 36
3.1.2. PERFORMANCE MODELING RESULTS OF STREAM 43
3.2. RELATED WORK 44
3.2.1. EXPLOITATION OF AND OPTIMIZATION FOR NUMA 44
3.2.2. THREAD BINDING ON NUMA 45
3.2.3. EXTENDING PROGRAMMING MODELS TOWARDS NUMA 45
3.3. QUANTIFYING THREAD AFFINITY 46
3.3.1. MACHINE MODEL: DISTANCE MATRIX 47
3.3.2. DISTANCE MATRICES FOR BENCHMARK ARCHITECTURES 49
3.4. THREAD AFFINITY STRATEGIES 51
3.4.1. THREAD AFFINITY REQUIREMENTS 55
3.4.2. THREAD AFFINITY MODEL: PLACES 58
3.4.3. THREAD AFFINITY MODEL: POLICIES 60
3.4.4. FINAL MEASUREMENTS WITH THE THREAD AFFINITY MODEL 64
3.5. MEMORY AFFINITY 66
3.5.1. SUPPORT FOR NESTED PARALLELISM 70
3.5.2. TASKS ON NUMA: PROGRAMMING PATTERNS 71
3.5.3. TASKS ON NUMA: IMPLEMENTATION COMPARISON 77
3.5.4. TOWARDS EXPLICIT MEMORY AFFINITY IN OPENMP 79
3.5.5. TOWARDS IMPROVED OS SUPPORT FOR NUMA 85
3.6. RESULTS AND DISCUSSION 88
4. PARALLELIZATION WITH OBJECT-ORIENTED ABSTRACTIONS 91
4.1. STATE OF THE ART AND RELATED WORK 92
4.2. THREAD- AND TASK-BASED ABSTRACTIONS FOR PARALLELISM 94
4.2.1. CLASS DESIGN FOR SPARSE LINEAR ALGEBRA KERNELS 100
4.2.2. PARALLEL IMPLEMENTATION: STRATEGY PATTERN 104
4.2.3. IMPLEMENTATION AS A LIBRARY: TEMPLATE METHOD 107
4.2.4. EMPLOYMENT IN EXISTING CODES: ADAPTER PATTERN 108
4.3. ABSTRACTIONS FOR NUMA 109
4.3.1. MEMORY MANAGEMENT: ALLOCATOR PATTERN 109
4.3.2. ARCHITECTURE EXAMINATION: SINGLETON PATTERN 112
4.4. SUPPORT FOR CLASS-TYPE VARIABLES IN OPENMP 114
4.4.1. SCOPING SEMANTIC 114
4.4.2. THREAD SAFETY OF STL DATA TYPES 117
4.5. RESULTS AND DISCUSSION 118
5. APPLICATION CASE STUDIES 121
5.1. JACOBI METHOD 121
5.2. FLEXIBLE IMAGE RETRIEVAL ENGINE (FIRE) 123
5.3. NAVIER-STOKES SOLVER DROPS (DROPS) 125
5.4. SHORT APPLICATION CASE STUDIES 126
5.4.1. VRFEM 127
5.4.2. MULTI-THREADING IN NEST 127
6. SUMMARY AND OUTLOOK 133
LIST OF FIGURES 137
LIST OF TABLES 140
STATEMENT OF ORIGINALITY 151
A. APPENDIX 155
A.1. RELATED SOURCE CODE 155
A.1.1. HEAPMANAGER 155
A.1.2. DISTRIBUTEDOMPALLOCATOR 156
A.1.3. SCOPETIMER 158
A.2. ACRONYMS AND NOMENCLATURE 160
|
any_adam_object | 1 |
author | Terboven, Christian |
author_GND | (DE-588)1078473412 |
author_facet | Terboven, Christian |
author_role | aut |
author_sort | Terboven, Christian |
author_variant | c t ct |
building | Verbundindex |
bvnumber | BV043741491 |
ctrlnum | (OCoLC)958176613 (DE-599)BVBBV043741491 |
edition | 1. Auflage |
format | Thesis Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>02463nam a2200565 cb4500</leader><controlfield tag="001">BV043741491</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">00000000000000.0</controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">160830s2016 |||| m||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9783863594428</subfield><subfield code="9">978-3-86359-442-8</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)958176613</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV043741491</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-29T</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Terboven, Christian</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1078473412</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Abstractions for performance programming on multi-core architectures with hierarchical memory</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">1. 
Auflage</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Aachen</subfield><subfield code="b">Apprimus Verlag</subfield><subfield code="c">2016</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">II, 160 Seiten</subfield><subfield code="b">Diagramme</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">Ergebnisse aus der Informatik</subfield><subfield code="v">7</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">Edition Wissenschaft Apprimus</subfield></datafield><datafield tag="502" ind1=" " ind2=" "><subfield code="b">Dissertation</subfield><subfield code="c">RWTH Aachen University</subfield><subfield code="d">2016</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">NUMA-Architektur</subfield><subfield code="0">(DE-588)1123829519</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">OpenMP</subfield><subfield code="0">(DE-588)4648816-9</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Parallelverarbeitung</subfield><subfield code="0">(DE-588)4075860-6</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Programmierung</subfield><subfield code="0">(DE-588)4076370-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield 
tag="650" ind1="0" ind2="7"><subfield code="a">Speicherhierarchie</subfield><subfield code="0">(DE-588)4256353-7</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Mehrkernprozessor</subfield><subfield code="0">(DE-588)7598578-0</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Abstraktionsebene</subfield><subfield code="0">(DE-588)4254804-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Gemeinsamer Speicher</subfield><subfield code="0">(DE-588)4294156-8</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Mehrprozessorsystem</subfield><subfield code="0">(DE-588)4038397-0</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="655" ind1=" " ind2="7"><subfield code="0">(DE-588)4113937-9</subfield><subfield code="a">Hochschulschrift</subfield><subfield code="2">gnd-content</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Mehrkernprozessor</subfield><subfield code="0">(DE-588)7598578-0</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Speicherhierarchie</subfield><subfield code="0">(DE-588)4256353-7</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Gemeinsamer Speicher</subfield><subfield code="0">(DE-588)4294156-8</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="3"><subfield code="a">Parallelverarbeitung</subfield><subfield code="0">(DE-588)4075860-6</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" 
ind1="0" ind2="4"><subfield code="a">Programmierung</subfield><subfield code="0">(DE-588)4076370-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="5"><subfield code="a">Abstraktionsebene</subfield><subfield code="0">(DE-588)4254804-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="6"><subfield code="a">Mehrprozessorsystem</subfield><subfield code="0">(DE-588)4038397-0</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="7"><subfield code="a">NUMA-Architektur</subfield><subfield code="0">(DE-588)1123829519</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="8"><subfield code="a">OpenMP</subfield><subfield code="0">(DE-588)4648816-9</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="710" ind1="2" ind2=" "><subfield code="a">Apprimus Verlag</subfield><subfield code="e">Sonstige</subfield><subfield code="0">(DE-588)1068101474</subfield><subfield code="4">oth</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">DNB Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029153207&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-029153207</subfield></datafield></record></collection> |
genre | (DE-588)4113937-9 Hochschulschrift gnd-content |
genre_facet | Hochschulschrift |
id | DE-604.BV043741491 |
illustrated | Not Illustrated |
indexdate | 2024-07-10T07:33:51Z |
institution | BVB |
institution_GND | (DE-588)1068101474 |
isbn | 9783863594428 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-029153207 |
oclc_num | 958176613 |
open_access_boolean | |
owner | DE-29T |
owner_facet | DE-29T |
physical | II, 160 Seiten Diagramme |
publishDate | 2016 |
publishDateSearch | 2016 |
publishDateSort | 2016 |
publisher | Apprimus Verlag |
record_format | marc |
series2 | Ergebnisse aus der Informatik Edition Wissenschaft Apprimus |
spelling | Terboven, Christian Verfasser (DE-588)1078473412 aut Abstractions for performance programming on multi-core architectures with hierarchical memory 1. Auflage Aachen Apprimus Verlag 2016 II, 160 Seiten Diagramme txt rdacontent n rdamedia nc rdacarrier Ergebnisse aus der Informatik 7 Edition Wissenschaft Apprimus Dissertation RWTH Aachen University 2016 NUMA-Architektur (DE-588)1123829519 gnd rswk-swf OpenMP (DE-588)4648816-9 gnd rswk-swf Parallelverarbeitung (DE-588)4075860-6 gnd rswk-swf Programmierung (DE-588)4076370-5 gnd rswk-swf Speicherhierarchie (DE-588)4256353-7 gnd rswk-swf Mehrkernprozessor (DE-588)7598578-0 gnd rswk-swf Abstraktionsebene (DE-588)4254804-4 gnd rswk-swf Gemeinsamer Speicher (DE-588)4294156-8 gnd rswk-swf Mehrprozessorsystem (DE-588)4038397-0 gnd rswk-swf (DE-588)4113937-9 Hochschulschrift gnd-content Mehrkernprozessor (DE-588)7598578-0 s Speicherhierarchie (DE-588)4256353-7 s Gemeinsamer Speicher (DE-588)4294156-8 s Parallelverarbeitung (DE-588)4075860-6 s Programmierung (DE-588)4076370-5 s Abstraktionsebene (DE-588)4254804-4 s Mehrprozessorsystem (DE-588)4038397-0 s NUMA-Architektur (DE-588)1123829519 s OpenMP (DE-588)4648816-9 s DE-604 Apprimus Verlag Sonstige (DE-588)1068101474 oth DNB Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029153207&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Terboven, Christian Abstractions for performance programming on multi-core architectures with hierarchical memory NUMA-Architektur (DE-588)1123829519 gnd OpenMP (DE-588)4648816-9 gnd Parallelverarbeitung (DE-588)4075860-6 gnd Programmierung (DE-588)4076370-5 gnd Speicherhierarchie (DE-588)4256353-7 gnd Mehrkernprozessor (DE-588)7598578-0 gnd Abstraktionsebene (DE-588)4254804-4 gnd Gemeinsamer Speicher (DE-588)4294156-8 gnd Mehrprozessorsystem (DE-588)4038397-0 gnd |
subject_GND | (DE-588)1123829519 (DE-588)4648816-9 (DE-588)4075860-6 (DE-588)4076370-5 (DE-588)4256353-7 (DE-588)7598578-0 (DE-588)4254804-4 (DE-588)4294156-8 (DE-588)4038397-0 (DE-588)4113937-9 |
title | Abstractions for performance programming on multi-core architectures with hierarchical memory |
title_auth | Abstractions for performance programming on multi-core architectures with hierarchical memory |
title_exact_search | Abstractions for performance programming on multi-core architectures with hierarchical memory |
title_full | Abstractions for performance programming on multi-core architectures with hierarchical memory |
title_fullStr | Abstractions for performance programming on multi-core architectures with hierarchical memory |
title_full_unstemmed | Abstractions for performance programming on multi-core architectures with hierarchical memory |
title_short | Abstractions for performance programming on multi-core architectures with hierarchical memory |
title_sort | abstractions for performance programming on multi core architectures with hierarchical memory |
topic | NUMA-Architektur (DE-588)1123829519 gnd OpenMP (DE-588)4648816-9 gnd Parallelverarbeitung (DE-588)4075860-6 gnd Programmierung (DE-588)4076370-5 gnd Speicherhierarchie (DE-588)4256353-7 gnd Mehrkernprozessor (DE-588)7598578-0 gnd Abstraktionsebene (DE-588)4254804-4 gnd Gemeinsamer Speicher (DE-588)4294156-8 gnd Mehrprozessorsystem (DE-588)4038397-0 gnd |
topic_facet | NUMA-Architektur OpenMP Parallelverarbeitung Programmierung Speicherhierarchie Mehrkernprozessor Abstraktionsebene Gemeinsamer Speicher Mehrprozessorsystem Hochschulschrift |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029153207&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT terbovenchristian abstractionsforperformanceprogrammingonmulticorearchitectureswithhierarchicalmemory AT apprimusverlag abstractionsforperformanceprogrammingonmulticorearchitectureswithhierarchicalmemory |