A Software Approach to Unifying Multicore Caches


Bibliographic Details
Main Authors: Boyd-Wickizer, Silas, Kaashoek, M. Frans, Morris, Robert, Zeldovich, Nickolai
Other Authors: Robert Morris
Language: en-US
Published: 2011
Online Access: http://hdl.handle.net/1721.1/64698
Description: Multicore chips will have large amounts of fast on-chip cache memory, along with relatively slow DRAM interfaces. The on-chip cache memory, however, will be fragmented and spread over the chip; this distributed arrangement is hard for certain kinds of applications to exploit efficiently, and can lead to needless slow DRAM accesses. First, data accessed from many cores may be duplicated in many caches, reducing the amount of distinct data cached. Second, data in a cache distant from the accessing core may be slow to fetch via the cache coherence protocol. Third, software on each core can only allocate space in the small fraction of total cache memory that is local to that core. A new approach called software cache unification (SCU) addresses these challenges for applications that would be better served by a large shared cache. SCU chooses the on-chip cache in which to cache each item of data. As an application thread reads data items, SCU moves the thread to the core whose on-chip cache contains each item. This allows the thread to read the data quickly if it is already on-chip; if it is not, moving the thread causes the data to be loaded into the chosen on-chip cache. A new file cache for Linux, called MFC, uses SCU to improve the performance of file-intensive applications, such as Unix file utilities. An evaluation on a 16-core AMD Opteron machine shows that MFC improves the throughput of file utilities by a factor of 1.6. Experiments with a platform that emulates future machines with less DRAM throughput per core show that MFC will provide benefit to a growing range of applications.
Department: Parallel and Distributed Operating Systems
Report Number: MIT-CSAIL-TR-2011-032
Institution: Massachusetts Institute of Technology
Record ID: mit-1721.1/64698
Date Issued: 2011-06-28
Pages: 13 p.
Format: application/pdf
License: Creative Commons Attribution 3.0 Unported (http://creativecommons.org/licenses/by/3.0/)
Funding: This material is based upon work supported by the National Science Foundation under grant number 0915164.