Scaling a file system to many cores using an operation log

© 2017 Copyright is held by the owner/author(s). It is challenging to simultaneously achieve multicore scalability and high disk throughput in a file system. For example, even for commutative operations like creating different files in the same directory, current file systems introduce cache-line co...


Bibliographic Details
Main Authors: Bhat, Srivatsa S., Eqbal, Rasha, Clements, Austin T., Kaashoek, M. Frans, Zeldovich, Nickolai
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: English
Published: Association for Computing Machinery (ACM) 2021
Online Access: https://hdl.handle.net/1721.1/137612
collection MIT
description © 2017 Copyright is held by the owner/author(s). It is challenging to simultaneously achieve multicore scalability and high disk throughput in a file system. For example, even for commutative operations like creating different files in the same directory, current file systems introduce cache-line conflicts when updating an in-memory copy of the on-disk directory block, which limits scalability. ScaleFS is a novel file system design that decouples the in-memory file system from the on-disk file system using per-core operation logs. This design facilitates the use of highly concurrent data structures for the in-memory representation, which allows commutative operations to proceed without cache conflicts and hence scale perfectly. ScaleFS logs operations in a per-core log so that it can delay propagating updates to the disk representation (and the cache-line conflicts involved in doing so) until an fsync. The fsync call merges the per-core logs and applies the operations to disk. ScaleFS uses several techniques to perform the merge correctly while achieving good performance: timestamped linearization points to order updates without introducing cache-line conflicts, absorption of logged operations, and dependency tracking across operations. Experiments with a prototype of ScaleFS show that its implementation has no cache conflicts for 99% of test cases of commutative operations generated by Commuter, scales well on an 80-core machine, and provides on-disk performance that is comparable to that of Linux ext4.
first_indexed 2024-09-23T17:00:21Z
format Article
id mit-1721.1/137612
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T17:00:21Z
publishDate 2021
publisher Association for Computing Machinery (ACM)
record_format dspace
spelling mit-1721.1/137612 2022-09-29T23:02:30Z
2021-11-05T20:21:06Z 2021-11-05T20:21:06Z 2017 2019-06-03T17:08:23Z
Article http://purl.org/eprint/type/ConferencePaper https://hdl.handle.net/1721.1/137612
Bhat, Srivatsa S., Eqbal, Rasha, Clements, Austin T., Kaashoek, M. Frans and Zeldovich, Nickolai. 2017. "Scaling a file system to many cores using an operation log."
en
DOI 10.1145/3132747.3132779
Creative Commons Attribution 4.0 International license https://creativecommons.org/licenses/by/4.0/
application/pdf
Association for Computing Machinery (ACM)
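Illustration (not part of the catalog record): the description above says that operations are logged per core, ordered by timestamps, and merged and applied to disk at fsync, with absorption of logged operations. The toy Python sketch below shows that general idea only. It is not the ScaleFS implementation; the names ToyFS, Op, log, and fsync are hypothetical, the "disk" is just a set of file names, and dependency tracking is not modeled. It also assumes a single shared counter for timestamps, which is a simplification: a shared counter would itself cause the cache-line conflicts the paper sets out to avoid.

```python
import heapq
import itertools
from dataclasses import dataclass, field

# Assumption: one shared counter stands in for per-core synchronized
# timestamps. A real design would avoid this shared state.
_clock = itertools.count(1)


@dataclass(order=True)
class Op:
    timestamp: int                      # ordering key (linearization point)
    kind: str = field(compare=False)    # "create" or "unlink"
    name: str = field(compare=False)    # file name within one directory


class ToyFS:
    """Hypothetical toy model: per-core logs merged on fsync."""

    def __init__(self, ncores):
        self.logs = [[] for _ in range(ncores)]  # one log per core
        self.disk = set()                        # stand-in for on-disk state

    def log(self, core, kind, name):
        # Each core appends only to its own log, so logging does not
        # touch another core's log.
        self.logs[core].append(Op(next(_clock), kind, name))

    def fsync(self):
        # Each per-core log is already sorted by timestamp, so a k-way
        # merge yields one globally ordered sequence.
        merged = list(heapq.merge(*self.logs))
        for core_log in self.logs:
            core_log.clear()

        # Absorption: a create that is later cancelled by an unlink of the
        # same name within the merged log never needs to reach the disk.
        kept = []
        pending_create = {}  # name -> index in kept of a not-yet-applied create
        for op in merged:
            if op.kind == "unlink" and op.name in pending_create:
                kept[pending_create.pop(op.name)] = None
                continue
            if op.kind == "create":
                pending_create[op.name] = len(kept)
            kept.append(op)

        applied = [op for op in kept if op is not None]
        for op in applied:
            if op.kind == "create":
                self.disk.add(op.name)
            else:
                self.disk.discard(op.name)
        return applied
```

For example, if core 0 logs a create of "a", core 1 logs a create of "b" and then an unlink of "a", and core 0 logs a create of "c", a call to fsync in this sketch applies only the creates of "b" and "c", because the create and unlink of "a" absorb each other before anything is written.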