A Software Approach to Unifying Multicore Caches


Bibliographic Details
Main Authors: Boyd-Wickizer, Silas, Kaashoek, M. Frans, Morris, Robert, Zeldovich, Nickolai
Other Authors: Robert Morris
Language: en-US
Published: 2011
Online Access: http://hdl.handle.net/1721.1/64698
Description: Multicore chips will have large amounts of fast on-chip cache memory, along with relatively slow DRAM interfaces. The on-chip cache memory, however, will be fragmented and spread over the chip; this distributed arrangement is hard for certain kinds of applications to exploit efficiently, and can lead to needless slow DRAM accesses. First, data accessed from many cores may be duplicated in many caches, reducing the amount of distinct data cached. Second, data in a cache distant from the accessing core may be slow to fetch via the cache coherence protocol. Third, software on each core can only allocate space in the small fraction of total cache memory that is local to that core. A new approach called software cache unification (SCU) addresses these challenges for applications that would be better served by a large shared cache. SCU chooses the on-chip cache in which to cache each item of data. As an application thread reads data items, SCU moves the thread to the core whose on-chip cache contains each item. This allows the thread to read the data quickly if it is already on-chip; if it is not, moving the thread causes the data to be loaded into the chosen on-chip cache. A new file cache for Linux, called MFC, uses SCU to improve the performance of file-intensive applications, such as Unix file utilities. An evaluation on a 16-core AMD Opteron machine shows that MFC improves the throughput of file utilities by a factor of 1.6. Experiments with a platform that emulates future machines with less DRAM throughput per core show that MFC will provide benefit to a growing range of applications.
Department: Parallel and Distributed Operating Systems
Report Number: MIT-CSAIL-TR-2011-032
Institution: Massachusetts Institute of Technology
Record ID: mit-1721.1/64698
Date Issued: 2011-06-28
Pages: 13 p.
Format: application/pdf
License: Creative Commons Attribution 3.0 Unported (http://creativecommons.org/licenses/by/3.0/)
Funding: This material is based upon work supported by the National Science Foundation under grant number 0915164.