Design and evaluation of the Hamal parallel computer

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, February 2003.

Bibliographic Details
Main Author: Grossman, J. P., 1973-
Other Authors: Thomas F. Knight, Jr.
Format: Thesis
Language: eng
Published: Massachusetts Institute of Technology 2005
Subjects: Electrical Engineering and Computer Science.
Online Access: http://hdl.handle.net/1721.1/16909
_version_ 1826206464845283328
author Grossman, J. P., 1973-
author2 Thomas F. Knight, Jr.
author_facet Thomas F. Knight, Jr.
Grossman, J. P., 1973-
author_sort Grossman, J. P., 1973-
collection MIT
description Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, February 2003.
first_indexed 2024-09-23T13:31:46Z
format Thesis
id mit-1721.1/16909
institution Massachusetts Institute of Technology
language eng
last_indexed 2024-09-23T13:31:46Z
publishDate 2005
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/16909 2019-04-12T08:50:30Z Design and evaluation of the Hamal parallel computer Grossman, J. P., 1973- Thomas F. Knight, Jr. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, February 2003. "December 2002." Includes bibliographical references (p. 145-152). This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Parallel shared-memory machines with hundreds or thousands of processor-memory nodes have been built; in the future we will see machines with millions or even billions of nodes. Associated with such large systems is a new set of design challenges. Many problems must be addressed by an architecture in order for it to be successful; of these, we focus on three in particular. First, a scalable memory system is required. Second, the network messaging protocol must be fault-tolerant. Third, the overheads of thread creation, thread management and synchronization must be extremely low. This thesis presents the complete system design for Hamal, a shared-memory architecture which addresses these concerns and is directly scalable to one million nodes. Virtual memory and distributed objects are implemented in a manner that requires neither inter-node synchronization nor the storage of globally coherent translations at each node. We develop a lightweight fault-tolerant messaging protocol that guarantees message delivery and idempotence across a discarding network. A number of hardware mechanisms provide efficient support for massive multithreading and fine-grained synchronization.
Experiments are conducted in simulation, using a trace-driven network simulator to investigate the messaging protocol and a cycle-accurate simulator to evaluate the Hamal architecture. We determine implementation parameters for the messaging protocol which optimize performance. A discarding network is easier to design and can be clocked at a higher rate, and we find that with this protocol its performance can approach that of a non-discarding network. Our simulations of Hamal demonstrate the effectiveness of its thread management and synchronization primitives. In particular, we find register-based synchronization to be an extremely efficient mechanism which can be used to implement a software barrier with a latency of only 523 cycles on a 512 node machine. by J.P. Grossman. Ph.D. 2005-05-19T15:15:01Z 2005-05-19T15:15:01Z 2003 Thesis http://hdl.handle.net/1721.1/16909 52575362 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 152 p. 5345143 bytes 5344524 bytes application/pdf application/pdf application/pdf Massachusetts Institute of Technology
spellingShingle Electrical Engineering and Computer Science.
Grossman, J. P., 1973-
Design and evaluation of the Hamal parallel computer
title Design and evaluation of the Hamal parallel computer
title_full Design and evaluation of the Hamal parallel computer
title_fullStr Design and evaluation of the Hamal parallel computer
title_full_unstemmed Design and evaluation of the Hamal parallel computer
title_short Design and evaluation of the Hamal parallel computer
title_sort design and evaluation of the hamal parallel computer
topic Electrical Engineering and Computer Science.
url http://hdl.handle.net/1721.1/16909
work_keys_str_mv AT grossmanjp1973 designandevaluationofthehamalparallelcomputer