Tardis 2.0
Cache coherence scalability is a big challenge in shared memory systems. Traditional protocols do not scale due to the storage and traffic overhead of cache invalidation. Tardis, a recently proposed coherence protocol, removes cache invalidation using logical timestamps and achieves excellent scalab...
Main Authors: | Yu, Xiangyao; Liu, Hongzhe; Zou, Ethan; Devadas, Srinivas |
---|---|
Other Authors: | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
Format: | Article |
Language: | en_US |
Published: | Association for Computing Machinery (ACM) 2018 |
Online Access: | http://hdl.handle.net/1721.1/115327 https://orcid.org/0000-0003-4317-3457 https://orcid.org/0000-0001-8253-7714 |
_version_ | 1826211115385749504 |
---|---|
author | Yu, Xiangyao Liu, Hongzhe Zou, Ethan Devadas, Srinivas |
author2 | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
author_facet | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Yu, Xiangyao Liu, Hongzhe Zou, Ethan Devadas, Srinivas |
author_sort | Yu, Xiangyao |
collection | MIT |
description | Cache coherence scalability is a big challenge in shared memory systems. Traditional protocols do not scale due to the storage and traffic overhead of cache invalidation. Tardis, a recently proposed coherence protocol, removes cache invalidation using logical timestamps and achieves excellent scalability. The original Tardis protocol, however, only supports the Sequential Consistency (SC) memory model, limiting its applicability. Tardis also incurs extra network traffic on some benchmarks due to renew messages, and has suboptimal performance when the program uses spinning to communicate between threads.
In this paper, we address these downsides of the Tardis protocol and make it significantly more practical. Specifically, we discuss the architectural, memory system, and protocol changes required to implement the Total Store Order (TSO) consistency model on Tardis, and prove that the modified protocol satisfies TSO. We also describe modifications for Partial Store Order (PSO) and Release Consistency (RC). Finally, we propose optimizations that improve leasing policies and handle program spinning. On a set of benchmarks, optimized Tardis improves on a full-map directory protocol in performance, storage, and network traffic, while being simpler to implement. |
first_indexed | 2024-09-23T15:00:49Z |
format | Article |
id | mit-1721.1/115327 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T15:00:49Z |
publishDate | 2018 |
publisher | Association for Computing Machinery (ACM) |
record_format | dspace |
spelling | mit-1721.1/115327 2022-10-01T23:59:03Z Tardis 2.0 Yu, Xiangyao Liu, Hongzhe Zou, Ethan Devadas, Srinivas Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Yu, Xiangyao Devadas, Srinivas 2018-05-11T17:12:19Z 2018-05-11T17:12:19Z 2016-09 Article http://purl.org/eprint/type/ConferencePaper 978-1-4503-4121-9 http://hdl.handle.net/1721.1/115327 Yu, Xiangyao, et al. "Tardis 2.0: Optimized Time Traveling Coherence for Relaxed Consistency Models." PACT '16 Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 11-15 September, 2016, Haifa, Israel, ACM Press, 2016, pp. 261–74. https://orcid.org/0000-0003-4317-3457 https://orcid.org/0000-0001-8253-7714 en_US http://dx.doi.org/10.1145/2967938.2967942 Proceedings of the 2016 International Conference on Parallel Architectures and Compilation - PACT '16 Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Association for Computing Machinery (ACM) MIT Web Domain |
spellingShingle | Yu, Xiangyao Liu, Hongzhe Zou, Ethan Devadas, Srinivas Tardis 2.0 |
title | Tardis 2.0 |
title_sort | tardis 2 0 |
url | http://hdl.handle.net/1721.1/115327 https://orcid.org/0000-0003-4317-3457 https://orcid.org/0000-0001-8253-7714 |
work_keys_str_mv | AT yuxiangyao tardis20 AT liuhongzhe tardis20 AT zouethan tardis20 AT devadassrinivas tardis20 |