Compiling data-parallel programs for efficient execution on shared-memory multiprocessors
Abstract: Data parallelism is well suited, by virtue of algorithmic, architectural, and linguistic considerations, to serve as a basis for portable parallel programming. However, the fine-grained parallelism characteristic of data-parallel programs makes the efficient implementation of such languages on MIMD machines …
Main Author: | Chatterjee, Siddhartha |
---|---|
Format: | Book |
Language: | English |
Published: | Pittsburgh, Pa. : School of Computer Science, 1991 |
Subjects: | Compilers (Computer programs); Parallel processing (Electronic computers) |
Summary: | Abstract: Data parallelism is well suited, by virtue of algorithmic, architectural, and linguistic considerations, to serve as a basis for portable parallel programming. However, the fine-grained parallelism characteristic of data-parallel programs makes the efficient implementation of such languages on MIMD machines a challenging task, due to the high overheads these machines incur at small grain sizes. We claim that compile-time analysis can be used to reduce these overheads, thereby allowing data-parallel code to run efficiently on MIMD machines. This dissertation reports on the design, implementation, and evaluation of an optimizing compiler for an applicative nested data-parallel language called VCODE. The target machine is the Encore Multimax, a coherent-cache shared-memory multiprocessor. The source language allows nested aggregate data types and provides a variety of aggregate operations, including elementwise forms, scans, reductions, and permutations. Such features greatly expand the range of applications that can be cast into a data-parallel model. We present a small set of powerful compile-time techniques that reduce the overheads on MIMD machines in several ways: by increasing the grain size of the output program, by reducing synchronization and storage requirements, and by improving locality of reference (see the sketch after this table). The two key ideas behind these optimizations are the symbolic analysis of loop structures, and the hierarchical clustering of the program graph, first by loop structure and then by loop traversal patterns. This localizes synchronization and work-distribution actions to well-defined points in the output code. Loop traversal patterns are then used to identify parallel loops and to eliminate unnecessary intermediate storage. The most significant aspect of the analysis techniques is that they are symbolic in nature and work in the presence of control constructs such as conditionals and recursion. A compiler has been implemented based on these ideas and has been used to compile a large number of benchmarks. |
Description: | Also presented as: doctoral dissertation, Carnegie Mellon Univ., Pittsburgh, Pa., 1991 |
Description: | XVIII, 175 p., graphs |
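The compile-time techniques named in the summary (larger grain size, fewer synchronization points, less intermediate storage, better locality) are easiest to see on a chain of elementwise aggregate operations. The C sketch below is purely illustrative and is not taken from the dissertation; the function and variable names are hypothetical. It contrasts the naive translation, one fine-grained loop per aggregate operation with an intermediate vector `t`, against the single fused loop that clustering by loop structure makes possible.

```c
#include <stddef.h>

/* Unfused: each aggregate operation becomes its own loop (small grain
 * size), and the intermediate vector t must be allocated, written, and
 * re-read, costing storage and memory traffic. */
void unfused(const double *a, const double *b, double *t,
             double *r, size_t n) {
    for (size_t i = 0; i < n; i++)
        t[i] = a[i] + b[i];        /* elementwise add   */
    for (size_t i = 0; i < n; i++)
        r[i] = 2.0 * t[i];         /* elementwise scale */
}

/* Fused: one coarse-grained loop, no intermediate storage, and each
 * element of a and b is touched exactly once (better locality). */
void fused(const double *a, const double *b, double *r, size_t n) {
    for (size_t i = 0; i < n; i++)
        r[i] = 2.0 * (a[i] + b[i]);
}
```

On a shared-memory MIMD machine the fused form also needs only one work-distribution and synchronization point instead of two, which is the overhead reduction the abstract describes.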
Internal format (MARC)
LEADER | 00000nam a2200000 c 4500
001 | BV006149275
003 | DE-604
005 | 00000000000000.0
007 | t
008 | 930107s1991 d||| m||| 00||| eng d
035 | |a (OCoLC)25502822
035 | |a (DE-599)BVBBV006149275
040 | |a DE-604 |b ger |e rakddb
041 0 | |a eng
050 0 | |a QA75.58
084 | |a DAT 383d |2 stub
084 | |a DAT 212d |2 stub
084 | |a DAT 516d |2 stub
088 | |a CMU CS 91 189
100 1 | |a Chatterjee, Siddhartha |e Verfasser |4 aut
245 1 0 | |a Compiling data-parallel programs for efficient execution on shared-memory multiprocessors
246 1 3 | |a CMU CS 91 189
264 1 | |a Pittsburgh, Pa. |b School of Computer Science |c 1991
300 | |a XVIII, 175 S. |b graph. Darst.
336 | |b txt |2 rdacontent
337 | |b n |2 rdamedia
338 | |b nc |2 rdacarrier
500 | |a Zugl.: Pittsburgh, Pa., Carnegie Mellon Univ., Diss., 1991
520 3 | |a Abstract: Data parallelism is well suited, by virtue of algorithmic, architectural, and linguistic considerations, to serve as a basis for portable parallel programming. However, the fine-grained parallelism characteristic of data-parallel programs makes the efficient implementation of such languages on MIMD machines a challenging task, due to the high overheads these machines incur at small grain sizes. We claim that compile-time analysis can be used to reduce these overheads, thereby allowing data-parallel code to run efficiently on MIMD machines. This dissertation reports on the design, implementation, and evaluation of an optimizing compiler for an applicative nested data-parallel language called VCODE.
520 3 | |a The target machine is the Encore Multimax, a coherent-cache shared-memory multiprocessor. The source language allows nested aggregate data types and provides a variety of aggregate operations, including elementwise forms, scans, reductions, and permutations. Such features greatly expand the range of applications that can be cast into a data-parallel model. We present a small set of powerful compile-time techniques that reduce the overheads on MIMD machines in several ways: by increasing the grain size of the output program, by reducing synchronization and storage requirements, and by improving locality of reference.
520 3 | |a The two key ideas behind these optimizations are the symbolic analysis of loop structures, and the hierarchical clustering of the program graph, first by loop structure and then by loop traversal patterns. This localizes synchronization and work-distribution actions to well-defined points in the output code. Loop traversal patterns are then used to identify parallel loops and to eliminate unnecessary intermediate storage. The most significant aspect of the analysis techniques is that they are symbolic in nature and work in the presence of control constructs such as conditionals and recursion. A compiler has been implemented based on these ideas and has been used to compile a large number of benchmarks.
650 4 | |a Compilers (Computer programs)
650 4 | |a Parallel processing (Electronic computers)
655 7 | |0 (DE-588)4113937-9 |a Hochschulschrift |2 gnd-content
999 | |a oai:aleph.bib-bvb.de:BVB01-003888401
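The 520 fields above mention scans and reductions over nested aggregate types. In nested data-parallel languages of the VCODE family, nesting is typically flattened: the data live in one flat vector and segment boundaries in a companion flag vector, so aggregate operations become segmented ones. The C sketch below illustrates this under that assumption and is not code from this compiler; a parallel MIMD version would additionally compute per-processor partial sums and combine them at a single synchronization point.

```c
#include <stdio.h>
#include <stddef.h>

/* Exclusive segmented plus-scan over a flattened nested sequence.
 * flags[i] is nonzero where a new inner segment begins. */
void segmented_plus_scan(const double *x, const int *flags,
                         double *out, size_t n) {
    double acc = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (flags[i]) acc = 0.0;   /* restart at each segment boundary    */
        out[i] = acc;              /* exclusive scan: sum of prior values */
        acc += x[i];
    }
}

int main(void) {
    /* Nested sequence [[1,2,3],[4,5]] flattened to data + segment flags. */
    double x[] = {1, 2, 3, 4, 5};
    int    f[] = {1, 0, 0, 1, 0};
    double out[5];
    segmented_plus_scan(x, f, out, 5);
    for (int i = 0; i < 5; i++)
        printf("%g ", out[i]);     /* prints: 0 1 3 0 4 */
    printf("\n");
    return 0;
}
```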