Victim Migration: Dynamically Adapting Between Private and Shared CMP Caches

Future CMPs will have more cores and greater on-chip cache capacity. The on-chip cache can either be divided into separate private L2 caches for each core, or treated as a large shared L2 cache. Private caches provide low hit latency but low capacity, while shared caches have higher hit latencies but...


Bibliographic Details
Main Authors:	Zhang, Michael; Asanovic, Krste
Other Authors:	Computer Architecture
Language:	en_US
Published:	2005
Online Access:	http://hdl.handle.net/1721.1/30574

author	Zhang, Michael; Asanovic, Krste
author2	Computer Architecture
author_facet	Computer Architecture; Zhang, Michael; Asanovic, Krste
author_sort	Zhang, Michael
collection	MIT
description	Future CMPs will have more cores and greater on-chip cache capacity. The on-chip cache can either be divided into separate private L2 caches for each core, or treated as a large shared L2 cache. Private caches provide low hit latency but low capacity, while shared caches have higher hit latencies but greater capacity. Victim replication was previously introduced as a way of reducing the average hit latency of a shared cache by allowing a processor to make a replica of a primary cache victim in its local slice of the global L2 cache. Although victim replication performs well on multithreaded and single-threaded codes, it performs worse than the private scheme for multiprogrammed workloads, where there is little sharing between the different programs running at the same time. In this paper, we propose victim migration, which improves on victim replication by adding a set of migration tags on each node that implement an exclusive cache policy for replicas. When a replica has been created on a remote node, it is not also cached on the home node, but only recorded in the migration tags. This frees up space on the home node to store shared global lines or replicas for the local processor. We show that victim migration performs better than private, shared, and victim replication schemes across a range of single-threaded, multithreaded, and multiprogrammed workloads, while using less area than a private cache design. Victim migration reduces average memory access latency by up to 10% relative to victim replication.
first_indexed	2024-09-23T12:28:27Z
id	mit-1721.1/30574
institution	Massachusetts Institute of Technology
language	en_US
last_indexed	2024-09-23T12:28:27Z
publishDate	2005
record_format	dspace
spelling	mit-1721.1/30574 2019-04-12T08:26:34Z 2005-12-22T02:37:29Z 2005-12-22T02:37:29Z 2005-10-10 MIT-CSAIL-TR-2005-064 MIT-LCS-TR-1006 http://hdl.handle.net/1721.1/30574 en_US Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory 17 p. 18487877 bytes 796263 bytes application/postscript application/pdf
title	Victim Migration: Dynamically Adapting Between Private and Shared CMP Caches
title_sort	victim migration dynamically adapting between private and shared cmp caches
url	http://hdl.handle.net/1721.1/30574


Similar Items