The Locality-First Strategy for Developing Efficient Multicore Algorithm

To scale applications on multicores up to bigger problems, software systems must be optimized both for parallelism to take full advantage of the multiple cores and for locality to exploit the memory system for cache-friendliness. Parallelization alone does not suffice to reach peak performance due t...

Full description

Bibliographic Details
Main Author:	Xu, Helen Jiang
Other Authors:	Leiserson, Charles E.
Format:	Thesis
Published:	Massachusetts Institute of Technology 2022
Online Access:	https://hdl.handle.net/1721.1/143200 https://orcid.org/0000-0003-2232-3305

_version_	1811077386483007488
author	Xu, Helen Jiang
author2	Leiserson, Charles E.
author_facet	Leiserson, Charles E. Xu, Helen Jiang
author_sort	Xu, Helen Jiang
collection	MIT
description	To scale applications on multicores up to bigger problems, software systems must be optimized both for parallelism to take full advantage of the multiple cores and for locality to exploit the memory system for cache-friendliness. Parallelization alone does not suffice to reach peak performance due to the processor-memory gap: the increasing divergence of processor and memory speeds. Locality and parallelism are difficult to optimize for independently — and even more challenging to combine — because they tend to conflict with each other. I advocate that algorithm developers employ a locality-first strategy for developing efficient parallel and cache-friendly algorithms for multicores. That is, they should first understand and exploit locality as much as possible before introducing parallelism. I argue that an algorithm developer can achieve high-performing code more easily with the locality-first strategy than with either a parallelism-first strategy or a strategy of trying to optimize both simultaneously. I present ten artifacts that leverage the locality-first strategy to create fast multicore algorithms that are simple to describe and implement. For example, locality-first data structure design in graph processing achieves about 2× speedup over the state of the art. Additionally, I prove mathematically that multicore cache-replacement algorithms that take advantage of locality outperform all other online algorithms. The other eight artifacts make similar contributions in their respective domains. Together, these artifacts demonstrate that the locality-first strategy provides an effective roadmap for algorithm developers to design and implement theoretically and practically efficient multicore code.
first_indexed	2024-09-23T10:42:11Z
format	Thesis
id	mit-1721.1/143200
institution	Massachusetts Institute of Technology
last_indexed	2024-09-23T10:42:11Z
publishDate	2022
publisher	Massachusetts Institute of Technology
record_format	dspace
spelling	mit-1721.1/1432002022-06-16T03:40:32Z The Locality-First Strategy for Developing Efficient Multicore Algorithm Xu, Helen Jiang Leiserson, Charles E. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science To scale applications on multicores up to bigger problems, software systems must be optimized both for parallelism to take full advantage of the multiple cores and for locality to exploit the memory system for cache-friendliness. Parallelization alone does not suffice to reach peak performance due to the processor-memory gap: the increasing divergence of processor and memory speeds. Locality and parallelism are difficult to optimize for independently — and even more challenging to combine — because they tend to conflict with each other. I advocate that algorithm developers employ a locality-first strategy for developing efficient parallel and cache-friendly algorithms for multicores. That is, they should first understand and exploit locality as much as possible before introducing parallelism. I argue that an algorithm developer can achieve high-performing code more easily with the locality-first strategy than with either a parallelism-first strategy or a strategy of trying to optimize both simultaneously. I present ten artifacts that leverage the locality-first strategy to create fast multicore algorithms that are simple to describe and implement. For example, locality-first data structure design in graph processing achieves about 2× speedup over the state of the art. Additionally, I prove mathematically that multicore cache-replacement algorithms that take advantage of locality outperform all other online algorithms. The other eight artifacts make similar contributions in their respective domains. Together, these artifacts demonstrate that the locality-first strategy provides an effective roadmap for algorithm developers to design and implement theoretically and practically efficient multicore code. Ph.D. 2022-06-15T13:03:04Z 2022-06-15T13:03:04Z 2022-02 2022-03-04T20:47:45.889Z Thesis https://hdl.handle.net/1721.1/143200 https://orcid.org/0000-0003-2232-3305 In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology
spellingShingle	Xu, Helen Jiang The Locality-First Strategy for Developing Efficient Multicore Algorithm
title	The Locality-First Strategy for Developing Efficient Multicore Algorithm
title_full	The Locality-First Strategy for Developing Efficient Multicore Algorithm
title_fullStr	The Locality-First Strategy for Developing Efficient Multicore Algorithm
title_full_unstemmed	The Locality-First Strategy for Developing Efficient Multicore Algorithm
title_short	The Locality-First Strategy for Developing Efficient Multicore Algorithm
title_sort	locality first strategy for developing efficient multicore algorithm
url	https://hdl.handle.net/1721.1/143200 https://orcid.org/0000-0003-2232-3305
work_keys_str_mv	AT xuhelenjiang thelocalityfirststrategyfordevelopingefficientmulticorealgorithm AT xuhelenjiang localityfirststrategyfordevelopingefficientmulticorealgorithm

The Locality-First Strategy for Developing Efficient Multicore Algorithm

Similar Items