OSPREY: Implementation of Memory Consistency Models for Cache Coherence Protocols involving Invalidation-Free Data Access

Data access in modern processors contributes significantly to the overall performance and energy consumption. Traditionally, data is distributed among the cores through an on-chip cache hierarchy, and each producer/consumer accesses data through its private level-1 cache relying on the cache coheren...

Bibliographic Details
Main Authors: Kurian, George, Shi, Qingchuan, Devadas, Srinivas, Khan, Omer
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format: Article
Language: en_US
Published: Institute of Electrical and Electronics Engineers (IEEE) 2018
Online Access: http://hdl.handle.net/1721.1/115325
https://orcid.org/0000-0001-8253-7714
author Kurian, George
Shi, Qingchuan
Devadas, Srinivas
Khan, Omer
author2 Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
collection MIT
description Data access in modern processors contributes significantly to the overall performance and energy consumption. Traditionally, data is distributed among the cores through an on-chip cache hierarchy, and each producer/consumer accesses data through its private level-1 cache, relying on the cache coherence protocol for consistency. Recently, remote access, a mechanism that reduces energy and latency through word-level access to data anywhere on chip, has been proposed. Remote access does not replicate data in the private caches, and thereby removes the need for expensive cache line invalidations or updates. Researchers have implemented remote access as an auxiliary mechanism in cache coherence to improve efficiency. Unfortunately, stronger memory models, such as Intel's TSO, require strict ordering among the loads and stores. This introduces serialization penalties for data classified to be accessed remotely, which hampers each core's ability to optimally exploit memory-level parallelism. In this paper we propose a novel timestamp-based scheme to detect memory consistency violations. The proposed scheme enables remote accesses to be issued and completed in parallel while continuously detecting whether any ordering violations have occurred, and rolling back the pipeline state (if needed). We implement our scheme for the locality-aware cache coherence protocol that uses remote access as an auxiliary mechanism for efficient data access. Our evaluation using a 64-core multicore processor with out-of-order speculative cores shows that the proposed technique improves completion time by 26% and energy by 20% over a state-of-the-art cache management scheme.
first_indexed 2024-09-23T15:53:27Z
format Article
id mit-1721.1/115325
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T15:53:27Z
publishDate 2018
publisher Institute of Electrical and Electronics Engineers (IEEE)
record_format dspace
spelling mit-1721.1/115325
2022-10-02T04:53:35Z
National Science Foundation (U.S.) (Grant CCF-1452327)
2018-05-11T16:55:28Z
2018-05-11T16:55:28Z
2016-03
2015-10
Article
http://purl.org/eprint/type/ConferencePaper
978-1-4673-9524-3
http://hdl.handle.net/1721.1/115325
Kurian, George, et al. "OSPREY: Implementation of Memory Consistency Models for Cache Coherence Protocols Involving Invalidation-Free Data Access." 2015 International Conference on Parallel Architecture and Compilation (PACT), 18-25 October, 2015, San Francisco, California, IEEE, 2015, pp. 392–405.
https://orcid.org/0000-0001-8253-7714
en_US
http://dx.doi.org/10.1109/PACT.2015.45
2015 International Conference on Parallel Architecture and Compilation (PACT)
Creative Commons Attribution-Noncommercial-Share Alike
http://creativecommons.org/licenses/by-nc-sa/4.0/
application/pdf
Institute of Electrical and Electronics Engineers (IEEE)
MIT Web Domain
title OSPREY: Implementation of Memory Consistency Models for Cache Coherence Protocols involving Invalidation-Free Data Access
url http://hdl.handle.net/1721.1/115325
https://orcid.org/0000-0001-8253-7714