A Hierarchical Cache Coherent Protocol

Bibliographic Details
Main Author: Wallach, Deborah A.
Language: en_US
Published: 2004
Online Access: http://hdl.handle.net/1721.1/7088
collection MIT
description As the number of processors in distributed-memory multiprocessors grows, efficiently supporting a shared-memory programming model becomes difficult. We have designed the Protocol for Hierarchical Directories (PHD) to allow shared-memory support for systems containing massive numbers of processors. PHD eliminates bandwidth problems by using a scalable network, decreases hot-spots by not relying on a single point to distribute blocks, and uses a scalable amount of space for its directories. PHD provides a shared-memory model by synthesizing a global shared memory from the local memories of processors. PHD supports sequentially consistent read, write, and test-and-set operations. This thesis also introduces a method of describing locality for hierarchical protocols and employs this method in the derivation of an abstract model of the protocol behavior. An embedded model, based on the work of Johnson [ISCA19], describes the protocol behavior when mapped to a k-ary n-cube. The thesis uses these two models to study the average height in the hierarchy that operations reach, the longest path messages travel, the number of messages that operations generate, the inter-transaction issue time, and the protocol overhead for different locality parameters, degrees of multithreading, and machine sizes. We determine that multithreading is only useful for approximately two to four threads; any additional interleaving does not decrease the overall latency. For small machines and high-locality applications, this limitation is due mainly to the length of the running threads. For large machines with medium to low locality, this limitation is due mainly to the protocol overhead being too large. Our study using the embedded model shows that in situations where the run length between references to shared memory is at least an order of magnitude longer than the time to process a single state transition in the protocol, applications exhibit good performance. If separate controllers for processing protocol requests are included, the protocol scales to 32k-processor machines as long as the application exhibits hierarchical locality: at least 22% of the global references must be able to be satisfied locally; at most 35% of the global references are allowed to reach the top level of the hierarchy.
first_indexed 2024-09-23T13:05:35Z
id mit-1721.1/7088
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T13:05:35Z
publishDate 2004
record_format dspace
spelling mit-1721.1/7088 2019-04-12T08:33:53Z A Hierarchical Cache Coherent Protocol Wallach, Deborah A. 2004-10-20T20:29:26Z 2004-10-20T20:29:26Z 1992-09-01 AITR-1645 http://hdl.handle.net/1721.1/7088 en_US AITR-1645 3979950 bytes 3395110 bytes application/postscript application/pdf application/postscript application/pdf