Sorting large files on a backend multiprocessor

Bibliographic details
Main authors: Beck, Micah (author), Bitton, Dina (author), Wilkinson, William K. (author)
Format: Book
Language: English
Published: Ithaca, New York 1986
Series: Cornell University <Ithaca, NY> / Department of Computer Science: Technical report 741
Subjects:
Summary: A fundamental measure of processing power in a database management system is the performance of the sort utility it provides. When sorting a large data file on a serial computer, performance is limited by factors involving processor speed, memory capacity and I/O bandwidth. In this paper, we investigate the feasibility and efficiency of a parallel sort-merge algorithm through implementation on the JASMIN prototype, a backend multiprocessor built around a fast packet bus. We describe the design and implementation of a parallel sort utility that may become a building block for query processing in a database system that runs on JASMIN. We present and analyze the results of measurements corresponding to a range of file sizes and processor configurations. Our results show that using current, off-the-shelf technology coupled with a streamlined distributed operating system, three- and five-microprocessor configurations provide a very cost-effective sort of large files. The three-processor configuration sorts a 100 megabyte file in one hour, which compares well with commercial sort packages available on high-performance mainframes. In additional experiments, we investigate a model to tune our sort software, and scale our results to higher processor and network capabilities.
Description: 27 pages