OSPREY: Implementation of Memory Consistency Models for Cache Coherence Protocols involving Invalidation-Free Data Access
Data access in modern processors contributes significantly to the overall performance and energy consumption. Traditionally, data is distributed among the cores through an on-chip cache hierarchy, and each producer/consumer accesses data through its private level-1 cache relying on the cache coheren...
Main Authors: | Kurian, George; Shi, Qingchuan; Devadas, Srinivas; Khan, Omer |
---|---|
Other Authors: | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
Format: | Article |
Language: | en_US |
Published: | Institute of Electrical and Electronics Engineers (IEEE) 2018 |
Online Access: | http://hdl.handle.net/1721.1/115325 https://orcid.org/0000-0001-8253-7714 |
_version_ | 1811093963430428672 |
---|---|
author | Kurian, George Shi, Qingchuan Devadas, Srinivas Khan, Omer |
author2 | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
collection | MIT |
description | Data access in modern processors contributes significantly to overall performance and energy consumption. Traditionally, data is distributed among the cores through an on-chip cache hierarchy, and each producer/consumer accesses data through its private level-1 cache, relying on the cache coherence protocol for consistency. Recently, remote access, a mechanism that reduces energy and latency through word-level access to data anywhere on chip, has been proposed. Remote access does not replicate data in the private caches and thereby removes the need for expensive cache line invalidations or updates. Researchers have implemented remote access as an auxiliary mechanism in cache coherence to improve efficiency. Unfortunately, stronger memory models, such as Intel's TSO, require strict ordering among loads and stores. This introduces serialization penalties for data classified to be accessed remotely, which hampers each core's ability to optimally exploit memory-level parallelism. In this paper, we propose a novel timestamp-based scheme to detect memory consistency violations. The proposed scheme enables remote accesses to be issued and completed in parallel while continuously detecting whether any ordering violations have occurred and rolling back the pipeline state if needed. We implement our scheme for the locality-aware cache coherence protocol, which uses remote access as an auxiliary mechanism for efficient data access. Our evaluation using a 64-core multicore processor with out-of-order speculative cores shows that the proposed technique improves completion time by 26% and energy by 20% over a state-of-the-art cache management scheme. |
first_indexed | 2024-09-23T15:53:27Z |
format | Article |
id | mit-1721.1/115325 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T15:53:27Z |
publishDate | 2018 |
publisher | Institute of Electrical and Electronics Engineers (IEEE) |
record_format | dspace |
spelling | mit-1721.1/115325 2022-10-02T04:53:35Z OSPREY: Implementation of Memory Consistency Models for Cache Coherence Protocols involving Invalidation-Free Data Access Kurian, George Shi, Qingchuan Devadas, Srinivas Khan, Omer Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Kurian, George Devadas, Srinivas National Science Foundation (U.S.) (Grant CCF-1452327) 2018-05-11T16:55:28Z 2018-05-11T16:55:28Z 2016-03 2015-10 Article http://purl.org/eprint/type/ConferencePaper 978-1-4673-9524-3 http://hdl.handle.net/1721.1/115325 Kurian, George, et al. "OSPREY: Implementation of Memory Consistency Models for Cache Coherence Protocols Involving Invalidation-Free Data Access." 2015 International Conference on Parallel Architecture and Compilation (PACT), 18-25 October, 2015, San Francisco, California, IEEE, 2015, pp. 392–405. https://orcid.org/0000-0001-8253-7714 en_US http://dx.doi.org/10.1109/PACT.2015.45 2015 International Conference on Parallel Architecture and Compilation (PACT) Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Institute of Electrical and Electronics Engineers (IEEE) MIT Web Domain |
title | OSPREY: Implementation of Memory Consistency Models for Cache Coherence Protocols involving Invalidation-Free Data Access |
url | http://hdl.handle.net/1721.1/115325 https://orcid.org/0000-0001-8253-7714 |