Compiling Parallel Loops for High Performance Computers: Partitioning, Data Assignment and Remapping

Bibliographic details
Main authors:	Hudak, David E. (author), Abraham, Santosh G. (author)
Format:	Electronic eBook
Language:	English
Published:	Boston, MA: Springer US, 1993
Series:	The Kluwer International Series in Engineering and Computer Science; 200
Subjects:	Computer Science; Processor Architectures; Microprocessors
Online access:	BTU01: URL of the original publisher
Summary:	4.2 Code Segments, 96; 4.3 Determining Communication Parameters, 99; 4.4 Multicast Communication Overhead, 103; 4.5 Partitioning, 103; 4.6 Experimental Results, 117; 4.7 Conclusion, 121. 5 Collective Partitioning and Remapping for Multiple Loop Nests, 125; 5.1 Introduction, 125; 5.2 Program Enclosure Trees, 128; 5.3 The CPR Algorithm, 132; 5.4 Experimental Results, 141; 5.5 Conclusion, 146. Bibliography, 149. Index, 157. List of Figures: 1.1 The Butterfly Architecture, 5; 1.2 Example of an iterative data-parallel loop, 7; 1.3 Contiguous tiling and assignment of an iteration space, 13; 2.1 Communication along a line segment, 24; 2.2 Access pattern for the access offset (3,2), 25; 2.3 Decomposing an access vector along an orthogonal basis set of vectors, 26; 2.4 An analysis of communication patterns, 29; 2.5 Decomposing a vector along two separate basis sets of vectors, 31; 2.6 Cache lines aligning with borders, 33; 2.7 Cache lines not aligned with borders, 34; 2.8 nh is the difference of nd and nb, 42; 2.9 nh is the sum of nd and nb, 42; 2.10 The ADAPT system, 44; 2.11 Code segment used in experiments, 46; 2.12 Execution rates for various partitions, 47; 2.13 Execution time of partitions on Multimax, 48; 2.14 Performance increase as processing power increases, 49; 2.15 Percentage miss ratios for various aspect ratios and line sizes
Description:	1 online resource (XV, 159 p.)
ISBN:	9781461531647
DOI:	10.1007/978-1-4615-3164-7

Internal format

MARC


LEADER	00000nmm a2200000zcb4500
001	BV045186532
003	DE-604
005	00000000000000.0
007	cr\|uuu---uuuuu
008	180912s1993 \|\|\|\| o\|\|u\| \|\|\|\|\|\|eng d
020			\|a 9781461531647 \|9 978-1-4615-3164-7
024	7		\|a 10.1007/978-1-4615-3164-7 \|2 doi
035			\|a (ZDB-2-ENG)978-1-4615-3164-7
035			\|a (OCoLC)1053793234
035			\|a (DE-599)BVBBV045186532
040			\|a DE-604 \|b ger \|e aacr
041	0		\|a eng
049			\|a DE-634
082	0		\|a 004.1 \|2 23
100	1		\|a Hudak, David E. \|e Verfasser \|4 aut
245	1	0	\|a Compiling Parallel Loops for High Performance Computers \|b Partitioning, Data Assignment and Remapping \|c by David E. Hudak, Santosh G. Abraham
264		1	\|a Boston, MA \|b Springer US \|c 1993
300			\|a 1 Online-Ressource (XV, 159 p)
336			\|b txt \|2 rdacontent
337			\|b c \|2 rdamedia
338			\|b cr \|2 rdacarrier
490	0		\|a The Kluwer International Series in Engineering and Computer Science \|v 200
520			\|a 4. 2 Code Segments . . . . . . . . . . . . . . . 96 4. 3 Determining Communication Parameters . 99 4. 4 Multicast Communication Overhead · 103 4. 5 Partitioning . . . . . . · 103 4. 6 Experimental Results . 117 4. 7 Conclusion. . . . . . . · 121 5 COLLECTIVE PARTITIONING AND REMAPPING FOR MULTIPLE LOOP NESTS 125 5. 1 Introduction. . . . . . . . . 125 5. 2 Program Enclosure Trees. . 128 5. 3 The CPR Algorithm . . 132 5. 4 Experimental Results. . 141 5. 5 Conclusion. . 146 BIBLIOGRAPHY. 149 INDEX . . . . . . . . 157 LIST OF FIGURES Figure 1. 1 The Butterfly Architecture. . . . . . . . . . 5 1. 2 Example of an iterative data-parallel loop . . 7 1. 3 Contiguous tiling and assignment of an iteration space. 13 2. 1 Communication along a line segment. . . 24 2. 2 Access pattern for the access offset, (3,2). 25 2. 3 Decomposing an access vector along an orthogonal basis set of vectors. . . . . . . . . . . . . . . . . . . 26 2. 4 An analysis of communication patterns. 29 2. 5 Decomposing a vector along two separate basis sets of vectors. 31 2. 6 Cache lines aligning with borders. 33 2. 7 Cache lines not aligned with borders. 34 2. 8 nh is the difference of nd and nb. 42 2. 9 nh is the sum of nd and nb. 42 2. 10 The ADAPT system. 44 2. 11 Code segment used in experiments. . 46 2. 12 Execution rates for various partitions. 47 2. 13 Execution time of partitions on Multimax. 48 2. 14 Performance increase as processing power increases. 49 2. 15 Percentage miss ratios for various aspect ratios and line sizes
650		4	\|a Computer Science
650		4	\|a Processor Architectures
650		4	\|a Computer science
650		4	\|a Microprocessors
700	1		\|a Abraham, Santosh G. \|4 aut
776	0	8	\|i Erscheint auch als \|n Druck-Ausgabe \|z 9781461363866
856	4	0	\|u https://doi.org/10.1007/978-1-4615-3164-7 \|x Verlag \|z URL des Erstveröffentlichers \|3 Volltext
912			\|a ZDB-2-ENG
940	1		\|q ZDB-2-ENG_Archiv
999			\|a oai:aleph.bib-bvb.de:BVB01-030575709
966	e		\|u https://doi.org/10.1007/978-1-4615-3164-7 \|l BTU01 \|p ZDB-2-ENG \|q ZDB-2-ENG_Archiv \|x Verlag \|3 Volltext

Record in the search index

_version_	1804178877627498496
any_adam_object
author	Hudak, David E. Abraham, Santosh G.
author_facet	Hudak, David E. Abraham, Santosh G.
author_role	aut aut
author_sort	Hudak, David E.
author_variant	d e h de deh s g a sg sga
building	Verbundindex
bvnumber	BV045186532
collection	ZDB-2-ENG
ctrlnum	(ZDB-2-ENG)978-1-4615-3164-7 (OCoLC)1053793234 (DE-599)BVBBV045186532
dewey-full	004.1
dewey-hundreds	000 - Computer science, information, general works
dewey-ones	004 - Computer science
dewey-raw	004.1
dewey-search	004.1
dewey-sort	14.1
dewey-tens	000 - Computer science, information, general works
discipline	Informatik
doi_str_mv	10.1007/978-1-4615-3164-7
format	Electronic eBook
id	DE-604.BV045186532
illustrated	Not Illustrated
indexdate	2024-07-10T08:10:57Z
institution	BVB
isbn	9781461531647
language	English
oai_aleph_id	oai:aleph.bib-bvb.de:BVB01-030575709
oclc_num	1053793234
open_access_boolean
owner	DE-634
owner_facet	DE-634
physical	1 Online-Ressource (XV, 159 p)
psigel	ZDB-2-ENG ZDB-2-ENG_Archiv ZDB-2-ENG ZDB-2-ENG_Archiv
publishDate	1993
publishDateSearch	1993
publishDateSort	1993
publisher	Springer US
record_format	marc
series2	The Kluwer International Series in Engineering and Computer Science
spellingShingle	Hudak, David E. Abraham, Santosh G. Compiling Parallel Loops for High Performance Computers Partitioning, Data Assignment and Remapping Computer Science Processor Architectures Computer science Microprocessors
title	Compiling Parallel Loops for High Performance Computers Partitioning, Data Assignment and Remapping
title_auth	Compiling Parallel Loops for High Performance Computers Partitioning, Data Assignment and Remapping
title_exact_search	Compiling Parallel Loops for High Performance Computers Partitioning, Data Assignment and Remapping
title_full	Compiling Parallel Loops for High Performance Computers Partitioning, Data Assignment and Remapping by David E. Hudak, Santosh G. Abraham
title_fullStr	Compiling Parallel Loops for High Performance Computers Partitioning, Data Assignment and Remapping by David E. Hudak, Santosh G. Abraham
title_full_unstemmed	Compiling Parallel Loops for High Performance Computers Partitioning, Data Assignment and Remapping by David E. Hudak, Santosh G. Abraham
title_short	Compiling Parallel Loops for High Performance Computers
title_sort	compiling parallel loops for high performance computers partitioning data assignment and remapping
title_sub	Partitioning, Data Assignment and Remapping
topic	Computer Science Processor Architectures Computer science Microprocessors
topic_facet	Computer Science Processor Architectures Computer science Microprocessors
url	https://doi.org/10.1007/978-1-4615-3164-7
work_keys_str_mv	AT hudakdavide compilingparallelloopsforhighperformancecomputerspartitioningdataassignmentandremapping AT abrahamsantoshg compilingparallelloopsforhighperformancecomputerspartitioningdataassignmentandremapping

Availability

No print copy is available.
