Amalgamated Lock-Elision

Hardware lock-elision (HLE) introduces concurrency into legacy lock-based code by optimistically executing critical sections in a fast-path as hardware transactions. Its main limitation is that in case of repeated aborts, it reverts to a fallback-path that acquires a serial lock. This fallback-path...

Full description

Bibliographic Details
Main Authors: Afek, Yehuda, Matveev, Alexander, Moll Thomae, Oscar R.
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language:English
Published: Springer Nature America, Inc 2020
Online Access:https://hdl.handle.net/1721.1/124906
_version_ 1826212671712657408
author Afek, Yehuda
Matveev, Alexander
Moll Thomae, Oscar R.
author2 Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
author_facet Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Afek, Yehuda
Matveev, Alexander
Moll Thomae, Oscar R.
author_sort Afek, Yehuda
collection MIT
description Hardware lock-elision (HLE) introduces concurrency into legacy lock-based code by optimistically executing critical sections in a fast-path as hardware transactions. Its main limitation is that in case of repeated aborts, it reverts to a fallback-path that acquires a serial lock. This fallback-path lacks hardware-software concurrency, because all fast-path hardware transactions abort and wait for the completion of the fallback. Software lock elision has no such limitation, but the overheads incurred are simply too high. We propose amalgamated lock-elision (ALE), a novel lock-elision algorithm that provides hardware-software concurrency and efficiency: the fallback-path executes concurrently with fast-path hardware transactions, while the common-path fast-path reads incur no overheads and proceed without any instrumentation. The key idea in ALE is to use a sequence of fine-grained locks in the fallback-path to detect conflicts with the fast-path, and at the same time reduce the costs of these locks by executing the fallback-path as a series segments, where each segment is a dynamic length short hardware transaction. We implemented ALE into GCC and tested the new system on Intel Haswell 16-way chip that provides hardware transactions. We benchmarked linked-lists, hash-tables and red-black trees, as well as converting KyotoCacheDB to use ALE in GCC, and all show that ALE significantly outperforms HLE. Keywords: Multicore; Hardware Lock Elision; Hardware Transactional Memory; Algorithms
first_indexed 2024-09-23T15:33:46Z
format Article
id mit-1721.1/124906
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T15:33:46Z
publishDate 2020
publisher Springer Nature America, Inc
record_format dspace
spelling mit-1721.1/1249062022-09-29T14:55:20Z Amalgamated Lock-Elision Afek, Yehuda Matveev, Alexander Moll Thomae, Oscar R. Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Hardware lock-elision (HLE) introduces concurrency into legacy lock-based code by optimistically executing critical sections in a fast-path as hardware transactions. Its main limitation is that in case of repeated aborts, it reverts to a fallback-path that acquires a serial lock. This fallback-path lacks hardware-software concurrency, because all fast-path hardware transactions abort and wait for the completion of the fallback. Software lock elision has no such limitation, but the overheads incurred are simply too high. We propose amalgamated lock-elision (ALE), a novel lock-elision algorithm that provides hardware-software concurrency and efficiency: the fallback-path executes concurrently with fast-path hardware transactions, while the common-path fast-path reads incur no overheads and proceed without any instrumentation. The key idea in ALE is to use a sequence of fine-grained locks in the fallback-path to detect conflicts with the fast-path, and at the same time reduce the costs of these locks by executing the fallback-path as a series segments, where each segment is a dynamic length short hardware transaction. We implemented ALE into GCC and tested the new system on Intel Haswell 16-way chip that provides hardware transactions. We benchmarked linked-lists, hash-tables and red-black trees, as well as converting KyotoCacheDB to use ALE in GCC, and all show that ALE significantly outperforms HLE. Keywords: Multicore; Hardware Lock Elision; Hardware Transactional Memory; Algorithms Israel Science Foundation (Grant 1386/11) National Science Foundation (U.S.) (Grant CCF-1217921) National Science Foundation (U.S.) (Grant CCF-1301926) National Science Foundation (U.S.) (Grant IIS-1447786) United States. Department of Energy (Grant ER26116/DE-SC0008923) 2020-04-28T18:58:05Z 2020-04-28T18:58:05Z 2015-11 2019-07-03T12:30:55Z Article http://purl.org/eprint/type/ConferencePaper 978-3-662-48652-8 978-3-662-48653-5 https://hdl.handle.net/1721.1/124906 Afek, Yehuda, et al. “Amalgamated Lock-Elision.” Distributed Computing, edited by Yoram Moses, vol. 9363, Springer Berlin Heidelberg (2015): pp. 309–24. en http://dx.doi.org/10.1007/978-3-662-48653-5_21 Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Springer Nature America, Inc MIT web domain
spellingShingle Afek, Yehuda
Matveev, Alexander
Moll Thomae, Oscar R.
Amalgamated Lock-Elision
title Amalgamated Lock-Elision
title_full Amalgamated Lock-Elision
title_fullStr Amalgamated Lock-Elision
title_full_unstemmed Amalgamated Lock-Elision
title_short Amalgamated Lock-Elision
title_sort amalgamated lock elision
url https://hdl.handle.net/1721.1/124906
work_keys_str_mv AT afekyehuda amalgamatedlockelision
AT matveevalexander amalgamatedlockelision
AT mollthomaeoscarr amalgamatedlockelision