Compiling Parallel Loops for High Performance Computers: Partitioning, Data Assignment and Remapping
Main authors: | Hudak, David E.; Abraham, Santosh G. |
---|---|
Format: | Electronic e-book |
Language: | English |
Published: | Boston, MA: Springer US, 1993 |
Series: | The Kluwer International Series in Engineering and Computer Science, vol. 200 |
Subjects: | Computer Science; Processor Architectures; Microprocessors |
Online access: | BTU01 full text |
Summary: | Excerpt from the table of contents and list of figures (page numbers in parentheses). Contents: 4.2 Code Segments (96); 4.3 Determining Communication Parameters (99); 4.4 Multicast Communication Overhead (103); 4.5 Partitioning (103); 4.6 Experimental Results (117); 4.7 Conclusion (121); 5 Collective Partitioning and Remapping for Multiple Loop Nests (125); 5.1 Introduction (125); 5.2 Program Enclosure Trees (128); 5.3 The CPR Algorithm (132); 5.4 Experimental Results (141); 5.5 Conclusion (146); Bibliography (149); Index (157). List of figures: 1.1 The Butterfly Architecture (5); 1.2 Example of an iterative data-parallel loop (7); 1.3 Contiguous tiling and assignment of an iteration space (13); 2.1 Communication along a line segment (24); 2.2 Access pattern for the access offset (3,2) (25); 2.3 Decomposing an access vector along an orthogonal basis set of vectors (26); 2.4 An analysis of communication patterns (29); 2.5 Decomposing a vector along two separate basis sets of vectors (31); 2.6 Cache lines aligning with borders (33); 2.7 Cache lines not aligned with borders (34); 2.8 nh is the difference of nd and nb (42); 2.9 nh is the sum of nd and nb (42); 2.10 The ADAPT system (44); 2.11 Code segment used in experiments (46); 2.12 Execution rates for various partitions (47); 2.13 Execution time of partitions on Multimax (48); 2.14 Performance increase as processing power increases (49); 2.15 Percentage miss ratios for various aspect ratios and line sizes |
Description: | 1 online resource (XV, 159 p.) |
ISBN: | 9781461531647 |
DOI: | 10.1007/978-1-4615-3164-7 |
Internal format
MARC
LEADER | 00000nmm a2200000zcb4500 | ||
---|---|---|---|
001 | BV045186532 | ||
003 | DE-604 | ||
005 | 00000000000000.0 | ||
007 | cr|uuu---uuuuu | ||
008 | 180912s1993 |||| o||u| ||||||eng d | ||
020 | |a 9781461531647 |9 978-1-4615-3164-7 | ||
024 | 7 | |a 10.1007/978-1-4615-3164-7 |2 doi | |
035 | |a (ZDB-2-ENG)978-1-4615-3164-7 | ||
035 | |a (OCoLC)1053793234 | ||
035 | |a (DE-599)BVBBV045186532 | ||
040 | |a DE-604 |b ger |e aacr | ||
041 | 0 | |a eng | |
049 | |a DE-634 | ||
082 | 0 | |a 004.1 |2 23 | |
100 | 1 | |a Hudak, David E. |e Verfasser |4 aut | |
245 | 1 | 0 | |a Compiling Parallel Loops for High Performance Computers |b Partitioning, Data Assignment and Remapping |c by David E. Hudak, Santosh G. Abraham |
264 | 1 | |a Boston, MA |b Springer US |c 1993 | |
300 | |a 1 Online-Ressource (XV, 159 p) | ||
336 | |b txt |2 rdacontent | ||
337 | |b c |2 rdamedia | ||
338 | |b cr |2 rdacarrier | ||
490 | 0 | |a The Kluwer International Series in Engineering and Computer Science |v 200 | |
520 | |a 4. 2 Code Segments . . . . . . . . . . . . . . . 96 4. 3 Determining Communication Parameters . 99 4. 4 Multicast Communication Overhead · 103 4. 5 Partitioning . . . . . . · 103 4. 6 Experimental Results . 117 4. 7 Conclusion. . . . . . . · 121 5 COLLECTIVE PARTITIONING AND REMAPPING FOR MULTIPLE LOOP NESTS 125 5. 1 Introduction. . . . . . . . . 125 5. 2 Program Enclosure Trees. . 128 5. 3 The CPR Algorithm . . 132 5. 4 Experimental Results. . 141 5. 5 Conclusion. . 146 BIBLIOGRAPHY. 149 INDEX . . . . . . . . 157 LIST OF FIGURES Figure 1. 1 The Butterfly Architecture. . . . . . . . . . 5 1. 2 Example of an iterative data-parallel loop . . 7 1. 3 Contiguous tiling and assignment of an iteration space. 13 2. 1 Communication along a line segment. . . 24 2. 2 Access pattern for the access offset, (3,2). 25 2. 3 Decomposing an access vector along an orthogonal basis set of vectors. . . . . . . . . . . . . . . . . . . 26 2. 4 An analysis of communication patterns. 29 2. 5 Decomposing a vector along two separate basis sets of vectors. 31 2. 6 Cache lines aligning with borders. 33 2. 7 Cache lines not aligned with borders. 34 2. 8 nh is the difference of nd and nb. 42 2. 9 nh is the sum of nd and nb. 42 2. 10 The ADAPT system. 44 2. 11 Code segment used in experiments. . 46 2. 12 Execution rates for various partitions. 47 2. 13 Execution time of partitions on Multimax. 48 2. 14 Performance increase as processing power increases. 49 2. 15 Percentage miss ratios for various aspect ratios and line sizes | ||
650 | 4 | |a Computer Science | |
650 | 4 | |a Processor Architectures | |
650 | 4 | |a Computer science | |
650 | 4 | |a Microprocessors | |
700 | 1 | |a Abraham, Santosh G. |4 aut | |
776 | 0 | 8 | |i Erscheint auch als |n Druck-Ausgabe |z 9781461363866 |
856 | 4 | 0 | |u https://doi.org/10.1007/978-1-4615-3164-7 |x Verlag |z URL des Erstveröffentlichers |3 Volltext |
912 | |a ZDB-2-ENG | ||
940 | 1 | |q ZDB-2-ENG_Archiv | |
999 | |a oai:aleph.bib-bvb.de:BVB01-030575709 | ||
966 | e | |u https://doi.org/10.1007/978-1-4615-3164-7 |l BTU01 |p ZDB-2-ENG |q ZDB-2-ENG_Archiv |x Verlag |3 Volltext |
Record in the search index
_version_ | 1804178877627498496 |
---|---|
any_adam_object | |
author | Hudak, David E. Abraham, Santosh G. |
author_facet | Hudak, David E. Abraham, Santosh G. |
author_role | aut aut |
author_sort | Hudak, David E. |
author_variant | d e h de deh s g a sg sga |
building | Verbundindex |
bvnumber | BV045186532 |
collection | ZDB-2-ENG |
ctrlnum | (ZDB-2-ENG)978-1-4615-3164-7 (OCoLC)1053793234 (DE-599)BVBBV045186532 |
dewey-full | 004.1 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 004 - Computer science |
dewey-raw | 004.1 |
dewey-search | 004.1 |
dewey-sort | 14.1 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik |
doi_str_mv | 10.1007/978-1-4615-3164-7 |
format | Electronic eBook |
id | DE-604.BV045186532 |
illustrated | Not Illustrated |
indexdate | 2024-07-10T08:10:57Z |
institution | BVB |
isbn | 9781461531647 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-030575709 |
oclc_num | 1053793234 |
open_access_boolean | |
owner | DE-634 |
owner_facet | DE-634 |
physical | 1 Online-Ressource (XV, 159 p) |
psigel | ZDB-2-ENG ZDB-2-ENG_Archiv ZDB-2-ENG ZDB-2-ENG_Archiv |
publishDate | 1993 |
publishDateSearch | 1993 |
publishDateSort | 1993 |
publisher | Springer US |
record_format | marc |
series2 | The Kluwer International Series in Engineering and Computer Science |
spellingShingle | Hudak, David E. Abraham, Santosh G. Compiling Parallel Loops for High Performance Computers Partitioning, Data Assignment and Remapping Computer Science Processor Architectures Computer science Microprocessors |
title | Compiling Parallel Loops for High Performance Computers Partitioning, Data Assignment and Remapping |
title_auth | Compiling Parallel Loops for High Performance Computers Partitioning, Data Assignment and Remapping |
title_exact_search | Compiling Parallel Loops for High Performance Computers Partitioning, Data Assignment and Remapping |
title_full | Compiling Parallel Loops for High Performance Computers Partitioning, Data Assignment and Remapping by David E. Hudak, Santosh G. Abraham |
title_fullStr | Compiling Parallel Loops for High Performance Computers Partitioning, Data Assignment and Remapping by David E. Hudak, Santosh G. Abraham |
title_full_unstemmed | Compiling Parallel Loops for High Performance Computers Partitioning, Data Assignment and Remapping by David E. Hudak, Santosh G. Abraham |
title_short | Compiling Parallel Loops for High Performance Computers |
title_sort | compiling parallel loops for high performance computers partitioning data assignment and remapping |
title_sub | Partitioning, Data Assignment and Remapping |
topic | Computer Science Processor Architectures Computer science Microprocessors |
topic_facet | Computer Science Processor Architectures Computer science Microprocessors |
url | https://doi.org/10.1007/978-1-4615-3164-7 |
work_keys_str_mv | AT hudakdavide compilingparallelloopsforhighperformancecomputerspartitioningdataassignmentandremapping AT abrahamsantoshg compilingparallelloopsforhighperformancecomputerspartitioningdataassignmentandremapping |