Amalgamated Lock-Elision
Hardware lock-elision (HLE) introduces concurrency into legacy lock-based code by optimistically executing critical sections in a fast-path as hardware transactions. Its main limitation is that in case of repeated aborts, it reverts to a fallback-path that acquires a serial lock. This fallback-path...
Main Authors: | , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: |
Springer Nature America, Inc
2020
|
Online Access: | https://hdl.handle.net/1721.1/124906 |
_version_ | 1826212671712657408 |
---|---|
author | Afek, Yehuda Matveev, Alexander Moll Thomae, Oscar R. |
author2 | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
author_facet | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Afek, Yehuda Matveev, Alexander Moll Thomae, Oscar R. |
author_sort | Afek, Yehuda |
collection | MIT |
description | Hardware lock-elision (HLE) introduces concurrency into legacy lock-based code by optimistically executing critical sections in a fast-path as hardware transactions. Its main limitation is that in case of repeated aborts, it reverts to a fallback-path that acquires a serial lock. This fallback-path lacks hardware-software concurrency, because all fast-path hardware transactions abort and wait for the completion of the fallback. Software lock elision has no such limitation, but the overheads incurred are simply too high. We propose amalgamated lock-elision (ALE), a novel lock-elision algorithm that provides hardware-software concurrency and efficiency: the fallback-path executes concurrently with fast-path hardware transactions, while the common-path fast-path reads incur no overheads and proceed without any instrumentation. The key idea in ALE is to use a sequence of fine-grained locks in the fallback-path to detect conflicts with the fast-path, and at the same time reduce the costs of these locks by executing the fallback-path as a series segments, where each segment is a dynamic length short hardware transaction. We implemented ALE into GCC and tested the new system on Intel Haswell 16-way chip that provides hardware transactions. We benchmarked linked-lists, hash-tables and red-black trees, as well as converting KyotoCacheDB to use ALE in GCC, and all show that ALE significantly outperforms HLE. Keywords: Multicore; Hardware Lock Elision; Hardware Transactional Memory; Algorithms |
first_indexed | 2024-09-23T15:33:46Z |
format | Article |
id | mit-1721.1/124906 |
institution | Massachusetts Institute of Technology |
language | English |
last_indexed | 2024-09-23T15:33:46Z |
publishDate | 2020 |
publisher | Springer Nature America, Inc |
record_format | dspace |
spelling | mit-1721.1/1249062022-09-29T14:55:20Z Amalgamated Lock-Elision Afek, Yehuda Matveev, Alexander Moll Thomae, Oscar R. Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Hardware lock-elision (HLE) introduces concurrency into legacy lock-based code by optimistically executing critical sections in a fast-path as hardware transactions. Its main limitation is that in case of repeated aborts, it reverts to a fallback-path that acquires a serial lock. This fallback-path lacks hardware-software concurrency, because all fast-path hardware transactions abort and wait for the completion of the fallback. Software lock elision has no such limitation, but the overheads incurred are simply too high. We propose amalgamated lock-elision (ALE), a novel lock-elision algorithm that provides hardware-software concurrency and efficiency: the fallback-path executes concurrently with fast-path hardware transactions, while the common-path fast-path reads incur no overheads and proceed without any instrumentation. The key idea in ALE is to use a sequence of fine-grained locks in the fallback-path to detect conflicts with the fast-path, and at the same time reduce the costs of these locks by executing the fallback-path as a series segments, where each segment is a dynamic length short hardware transaction. We implemented ALE into GCC and tested the new system on Intel Haswell 16-way chip that provides hardware transactions. We benchmarked linked-lists, hash-tables and red-black trees, as well as converting KyotoCacheDB to use ALE in GCC, and all show that ALE significantly outperforms HLE. Keywords: Multicore; Hardware Lock Elision; Hardware Transactional Memory; Algorithms Israel Science Foundation (Grant 1386/11) National Science Foundation (U.S.) (Grant CCF-1217921) National Science Foundation (U.S.) (Grant CCF-1301926) National Science Foundation (U.S.) (Grant IIS-1447786) United States. Department of Energy (Grant ER26116/DE-SC0008923) 2020-04-28T18:58:05Z 2020-04-28T18:58:05Z 2015-11 2019-07-03T12:30:55Z Article http://purl.org/eprint/type/ConferencePaper 978-3-662-48652-8 978-3-662-48653-5 https://hdl.handle.net/1721.1/124906 Afek, Yehuda, et al. “Amalgamated Lock-Elision.” Distributed Computing, edited by Yoram Moses, vol. 9363, Springer Berlin Heidelberg (2015): pp. 309–24. en http://dx.doi.org/10.1007/978-3-662-48653-5_21 Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Springer Nature America, Inc MIT web domain |
spellingShingle | Afek, Yehuda Matveev, Alexander Moll Thomae, Oscar R. Amalgamated Lock-Elision |
title | Amalgamated Lock-Elision |
title_full | Amalgamated Lock-Elision |
title_fullStr | Amalgamated Lock-Elision |
title_full_unstemmed | Amalgamated Lock-Elision |
title_short | Amalgamated Lock-Elision |
title_sort | amalgamated lock elision |
url | https://hdl.handle.net/1721.1/124906 |
work_keys_str_mv | AT afekyehuda amalgamatedlockelision AT matveevalexander amalgamatedlockelision AT mollthomaeoscarr amalgamatedlockelision |