Optimizing hardware granularity in parallel systems
Main author: | Thomas Kelly |
---|---|
Format: | Thesis Book |
Language: | English |
Published: | Edinburgh: University of Edinburgh, Dept. of Computer Science, [1995] |
Subjects: | Computer architecture; Parallel processing (Electronic computers) |
Summary: | Abstract: "In order for parallel architectures to be of any use at all in providing superior performance to uniprocessors, the benefits of splitting the workload among several processing elements must outweigh the overheads associated with this 'divide and conquer' strategy. Whether or not this is the case depends on the nature of the algorithm and on the cost: performance functions associated with the real computer hardware available at a given time. This thesis is an investigation into the tradeoff of grain of hardware versus speed of hardware, in an attempt to show how the optimal hardware parallelism can be assessed. A model is developed of the execution time T of an algorithm on a machine as a function of the number of nodes, N. The model is used to examine the degree to which it is possible to obtain an optimal value of N, corresponding to minimum execution time. Specifically, the optimization is done assuming a particular base architecture, an algorithm or class thereof and an overall hardware cost. Two base architectures and algorithm types are considered, corresponding to two common classes of parallel architectures: a shared memory multiprocessor and a message-passing multicomputer. The former is represented by a simple shared-bus multiprocessor in which each processing element performs operations on data stored in a global shared store. The second type is represented by a two-dimensional mesh-connected multicomputer. In this type of system all memory is considered private and data sharing is carried out using 'messages' explicitly passed among the PEs." |
Description: | xii, 172 p. ill. 21 cm |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV035045110 | ||
003 | DE-604 | ||
005 | 00000000000000.0 | ||
007 | t | ||
008 | 080909s1995 a||| m||| 00||| eng d | ||
035 | |a (OCoLC)37433979 | ||
035 | |a (DE-599)BVBBV035045110 | ||
040 | |a DE-604 |b ger |e rakwb | ||
041 | 0 | |a eng | |
049 | |a DE-91G | ||
088 | |a CST-123-95 | ||
100 | 0 | |a Kelly Thomas |e Verfasser |4 aut | |
245 | 1 | 0 | |a Optimizing hardware granularity in parallel systems |c Thomas Kelly |
264 | 1 | |a Edinburgh |b University of Edinburgh, Dept. of Computer Science |c [1995] | |
300 | |a xii, 172 p. |b ill. |c 21 cm | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
502 | |a Thesis (Ph. D.)--University of Edinburgh, 1995 | ||
520 | 3 | |a Abstract: "In order for parallel architectures to be of any use at all in providing superior performance to uniprocessors, the benefits of splitting the workload among several processing elements must outweigh the overheads associated with this 'divide and conquer' strategy. Whether or not this is the case depends on the nature of the algorithm and on the cost: performance functions associated with the real computer hardware available at a given time. This thesis is an investigation into the tradeoff of grain of hardware versus speed of hardware, in an attempt to show how the optimal hardware parallelism can be assessed. A model is developed of the execution time T of an algorithm on a machine as a function of the number of nodes, N. The model is used to examine the degree to which it is possible to obtain an optimal value of N, corresponding to minimum execution time. Specifically, the optimization is done assuming a particular base architecture, an algorithm or class thereof and an overall hardware cost. Two base architectures and algorithm types are considered, corresponding to two common classes of parallel architectures: a shared memory multiprocessor and a message-passing multicomputer. The former is represented by a simple shared-bus multiprocessor in which each processing element performs operations on data stored in a global shared store. The second type is represented by a two- dimensional mesh-connected multicomputer. In this type of system all memory is considered private and data sharing is carried out using 'messages' explicitly passed among the PEs." | |
650 | 4 | |a Computer architecture | |
650 | 4 | |a Parallel processing (Electronic computers) | |
655 | 7 | |0 (DE-588)4113937-9 |a Hochschulschrift |2 gnd-content | |
999 | |a oai:aleph.bib-bvb.de:BVB01-016713885 |
Record in the search index
_version_ | 1804137981609508864 |
---|---|
adam_txt | |
any_adam_object | |
any_adam_object_boolean | |
author | Kelly Thomas |
author_facet | Kelly Thomas |
author_role | aut |
author_sort | Kelly Thomas |
author_variant | k t kt |
building | Verbundindex |
bvnumber | BV035045110 |
ctrlnum | (OCoLC)37433979 (DE-599)BVBBV035045110 |
format | Thesis Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>02669nam a2200337 c 4500</leader><controlfield tag="001">BV035045110</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">00000000000000.0</controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">080909s1995 a||| m||| 00||| eng d</controlfield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)37433979</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV035045110</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-91G</subfield></datafield><datafield tag="088" ind1=" " ind2=" "><subfield code="a">CST-123-95</subfield></datafield><datafield tag="100" ind1="0" ind2=" "><subfield code="a">Kelly Thomas</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Optimizing hardware granularity in parallel systems</subfield><subfield code="c">Thomas Kelly</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Edinburgh</subfield><subfield code="b">University of Edinburgh, Dept. 
of Computer Science</subfield><subfield code="c">[1995]</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">xii, 172 p.</subfield><subfield code="b">ill.</subfield><subfield code="c">21 cm</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="502" ind1=" " ind2=" "><subfield code="a">Thesis (Ph. D.)--University of Edinburgh, 1995</subfield></datafield><datafield tag="520" ind1="3" ind2=" "><subfield code="a">Abstract: "In order for parallel architectures to be of any use at all in providing superior performance to uniprocessors, the benefits of splitting the workload among several processing elements must outweigh the overheads associated with this 'divide and conquer' strategy. Whether or not this is the case depends on the nature of the algorithm and on the cost: performance functions associated with the real computer hardware available at a given time. This thesis is an investigation into the tradeoff of grain of hardware versus speed of hardware, in an attempt to show how the optimal hardware parallelism can be assessed. A model is developed of the execution time T of an algorithm on a machine as a function of the number of nodes, N. The model is used to examine the degree to which it is possible to obtain an optimal value of N, corresponding to minimum execution time. Specifically, the optimization is done assuming a particular base architecture, an algorithm or class thereof and an overall hardware cost. 
Two base architectures and algorithm types are considered, corresponding to two common classes of parallel architectures: a shared memory multiprocessor and a message-passing multicomputer. The former is represented by a simple shared-bus multiprocessor in which each processing element performs operations on data stored in a global shared store. The second type is represented by a two- dimensional mesh-connected multicomputer. In this type of system all memory is considered private and data sharing is carried out using 'messages' explicitly passed among the PEs."</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Computer architecture</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Parallel processing (Electronic computers)</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Computer architecture</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Parallel processing (Electronic computers)</subfield></datafield><datafield tag="655" ind1=" " ind2="7"><subfield code="0">(DE-588)4113937-9</subfield><subfield code="a">Hochschulschrift</subfield><subfield code="2">gnd-content</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-016713885</subfield></datafield></record></collection> |
genre | (DE-588)4113937-9 Hochschulschrift gnd-content |
genre_facet | Hochschulschrift |
id | DE-604.BV035045110 |
illustrated | Illustrated |
index_date | 2024-07-02T21:54:28Z |
indexdate | 2024-07-09T21:20:56Z |
institution | BVB |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-016713885 |
oclc_num | 37433979 |
open_access_boolean | |
owner | DE-91G DE-BY-TUM |
owner_facet | DE-91G DE-BY-TUM |
physical | xii, 172 p. ill. 21 cm |
publishDate | 1995 |
publishDateSearch | 1995 |
publishDateSort | 1995 |
publisher | University of Edinburgh, Dept. of Computer Science |
record_format | marc |
spelling | Kelly Thomas Verfasser aut Optimizing hardware granularity in parallel systems Thomas Kelly Edinburgh University of Edinburgh, Dept. of Computer Science [1995] xii, 172 p. ill. 21 cm txt rdacontent n rdamedia nc rdacarrier Thesis (Ph. D.)--University of Edinburgh, 1995 Abstract: "In order for parallel architectures to be of any use at all in providing superior performance to uniprocessors, the benefits of splitting the workload among several processing elements must outweigh the overheads associated with this 'divide and conquer' strategy. Whether or not this is the case depends on the nature of the algorithm and on the cost: performance functions associated with the real computer hardware available at a given time. This thesis is an investigation into the tradeoff of grain of hardware versus speed of hardware, in an attempt to show how the optimal hardware parallelism can be assessed. A model is developed of the execution time T of an algorithm on a machine as a function of the number of nodes, N. The model is used to examine the degree to which it is possible to obtain an optimal value of N, corresponding to minimum execution time. Specifically, the optimization is done assuming a particular base architecture, an algorithm or class thereof and an overall hardware cost. Two base architectures and algorithm types are considered, corresponding to two common classes of parallel architectures: a shared memory multiprocessor and a message-passing multicomputer. The former is represented by a simple shared-bus multiprocessor in which each processing element performs operations on data stored in a global shared store. The second type is represented by a two- dimensional mesh-connected multicomputer. In this type of system all memory is considered private and data sharing is carried out using 'messages' explicitly passed among the PEs." Computer architecture Parallel processing (Electronic computers) (DE-588)4113937-9 Hochschulschrift gnd-content |
spellingShingle | Kelly Thomas Optimizing hardware granularity in parallel systems Computer architecture Parallel processing (Electronic computers) |
subject_GND | (DE-588)4113937-9 |
title | Optimizing hardware granularity in parallel systems |
title_auth | Optimizing hardware granularity in parallel systems |
title_exact_search | Optimizing hardware granularity in parallel systems |
title_exact_search_txtP | Optimizing hardware granularity in parallel systems |
title_full | Optimizing hardware granularity in parallel systems Thomas Kelly |
title_fullStr | Optimizing hardware granularity in parallel systems Thomas Kelly |
title_full_unstemmed | Optimizing hardware granularity in parallel systems Thomas Kelly |
title_short | Optimizing hardware granularity in parallel systems |
title_sort | optimizing hardware granularity in parallel systems |
topic | Computer architecture Parallel processing (Electronic computers) |
topic_facet | Computer architecture Parallel processing (Electronic computers) Hochschulschrift |
work_keys_str_mv | AT kellythomas optimizinghardwaregranularityinparallelsystems |