Availability: A quantitative performance evaluation of SCI memory hierarchies

A quantitative performance evaluation of SCI memory hierarchies

Bibliographic Details
Main Author:	Hexsel, Roberto A. (Author)
Format:	Thesis Book
Language:	English
Published:	Edinburgh: University of Edinburgh, Dept. of Computer Science, [1994]
Subjects:	Computer storage devices > Evaluation; Memory hierarchy (Computer science); Multiprocessors > Evaluation; Hochschulschrift (thesis)
Summary:	Abstract: "The Scalable Coherent Interface (SCI) is an IEEE standard that defines a hardware platform for scalable shared-memory multiprocessors. SCI consists of three parts. The first is a set of physical interfaces that defines board sizes, wiring and network clock rates. The second is a communication protocol based on unidirectional point to point links. The third defines a cache coherence protocol based on a full directory that is distributed amongst the cache and memory modules. The cache controllers keep track of the copies of a given datum by maintaining them in a doubly linked list. SCI can scale up to 65520 nodes. This dissertation contains a quantitative performance evaluation of an SCI-connected multiprocessor that assesses both the communication and cache coherence subsystems. The simulator is driven by reference streams generated as a by-product of the execution of 'real' programs. The workload consists of three programs from the SPLASH suite and three parallel loops. The simplest topology supported by SCI is the ring. It was found that, for the hardware and software simulated, the largest efficient ring size is between eight and sixteen nodes and that raw network bandwidth seen by processing elements is limited at about 80Mbytes/s. This is because the network saturates when link traffic reaches 600-700Mbytes/s. These levels of link traffic only occur for two poorly designed programs. The other four programs generate low traffic and their execution speed is not limited by interconnect nor cache coherence protocol. An analytical model of the multiprocessor is used to assess the cost of some frequently occurring cache coherence protocol operations. In order to build large systems, networks more sophisticated than rings must be used. The performance of SCI meshes and cubes is evaluated for systems of up to 64 nodes. As with rings, processor throughput is also limited by link traffic for the same two poorly designed programs. Cubes are 10-15% faster than meshes for programs that generate high levels of network traffic. Otherwise, the differences are negligible. No significant relationship between cache size and network dimensionality was found."
Physical Description:	viii, 148 p., ill., 21 cm

Internal format

MARC


LEADER	00000nam a2200000 c 4500
001	BV035044980
003	DE-604
005	00000000000000.0
007	t
008	080909s1994 a||| m||| 00||| eng d
035			|a (OCoLC)36679662
035			|a (DE-599)BVBBV035044980
040			|a DE-604 |b ger |e rakwb
041	0		|a eng
049			|a DE-91G
088			|a CST-112-94
100	1		|a Hexsel, Roberto A. |e Verfasser |4 aut
245	1	0	|a A quantitative performance evaluation of SCI memory hierarchies |c Roberto A. Hexsel
264		1	|a Edinburgh |b University of Edinburgh, Dept. of Computer Science |c [1994]
300			|a viii, 148 p. |b ill. |c 21 cm
336			|b txt |2 rdacontent
337			|b n |2 rdamedia
338			|b nc |2 rdacarrier
502			|a Thesis (Ph. D.)--University of Edinburgh, 1994
520	3		|a Abstract: "The Scalable Coherent Interface (SCI) is an IEEE standard that defines a hardware platform for scalable shared-memory multiprocessors. SCI consists of three parts. The first is a set of physical interfaces that defines board sizes, wiring and network clock rates. The second is a communication protocol based on unidirectional point to point links. The third defines a cache coherence protocol based on a full directory that is distributed amongst the cache and memory modules. The cache controllers keep track of the copies of a given datum by maintaining them in a doubly linked list. SCI can scale up to 65520 nodes. This dissertation contains a quantitative performance evaluation of an SCI-connected multiprocessor that assesses both the communication and cache coherence subsystems. The simulator is driven by reference streams generated as a by-product of the execution of 'real' programs. The workload consists of three programs from the SPLASH suite and three parallel loops.
520	3		|a The simplest topology supported by SCI is the ring. It was found that, for the hardware and software simulated, the largest efficient ring size is between eight and sixteen nodes and that raw network bandwidth seen by processing elements is limited at about 80Mbytes/s. This is because the network saturates when link traffic reaches 600-700Mbytes/s. These levels of link traffic only occur for two poorly designed programs. The other four programs generate low traffic and their execution speed is not limited by interconnect nor cache coherence protocol. An analytical model of the multiprocessor is used to assess the cost of some frequently occurring cache coherence protocol operations. In order to build large systems, networks more sophisticated than rings must be used. The performance of SCI meshes and cubes is evaluated for systems of up to 64 nodes. As with rings, processor throughput is also limited by link traffic for the same two poorly designed programs.
520	3		|a Cubes are 10-15% faster than meshes for programs that generate high levels of network traffic. Otherwise, the differences are negligible. No significant relationship between cache size and network dimensionality was found."
650		4	|a Computer storage devices |x Evaluation
650		4	|a Memory hierarchy (Computer science)
650		4	|a Multiprocessors |x Evaluation
655		7	|0 (DE-588)4113937-9 |a Hochschulschrift |2 gnd-content
999			|a oai:aleph.bib-bvb.de:BVB01-016713763


Availability

No print copy is available.

Order via interlibrary loan. Note: not in THWS holdings!
