Scaling a file system to many cores using an operation log



Bibliographic Details
Main Authors: Bhat, Srivatsa S., Eqbal, Rasha, Clements, Austin T., Kaashoek, M. Frans, Zeldovich, Nickolai
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Document
Language: English
Published: Association for Computing Machinery (ACM) 2021
Online Access: https://hdl.handle.net/1721.1/137612
Summary: © 2017 Copyright is held by the owner/author(s). It is challenging to simultaneously achieve multicore scalability and high disk throughput in a file system. For example, even for commutative operations like creating different files in the same directory, current file systems introduce cache-line conflicts when updating an in-memory copy of the on-disk directory block, which limits scalability. ScaleFS is a novel file system design that decouples the in-memory file system from the on-disk file system using per-core operation logs. This design facilitates the use of highly concurrent data structures for the in-memory representation, which allows commutative operations to proceed without cache conflicts and hence scale perfectly. ScaleFS logs operations in a per-core log so that it can delay propagating updates to the disk representation (and the cache-line conflicts involved in doing so) until an fsync. The fsync call merges the per-core logs and applies the operations to disk. ScaleFS uses several techniques to perform the merge correctly while achieving good performance: timestamped linearization points to order updates without introducing cache-line conflicts, absorption of logged operations, and dependency tracking across operations. Experiments with a prototype of ScaleFS show that its implementation has no cache conflicts for 99% of test cases of commutative operations generated by Commuter, scales well on an 80-core machine, and provides on-disk performance that is comparable to that of Linux ext4.
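
Illustrative sketch (not from the record): the per-core logging and fsync-time merge described in the summary can be pictured with a small Python model. Everything here is an assumption made for illustration: the names `Op`, `PerCoreLogs`, and `absorb` are hypothetical, a shared counter stands in for whatever timestamp source ScaleFS actually uses, absorption is reduced to cancelling a create that is later unlinked, and dependency tracking is not modeled at all.

```python
import heapq
import itertools
from dataclasses import dataclass, field

# Hypothetical stand-in for a timestamp source. A shared counter like this
# would itself cause contention on real hardware; it only serves to give
# each logged operation a unique, increasing timestamp in this toy model.
_clock = itertools.count()


@dataclass(order=True)
class Op:
    ts: int                               # timestamp taken when the op is logged
    kind: str = field(compare=False)      # "create" or "unlink"
    name: str = field(compare=False)      # file name the op applies to


class PerCoreLogs:
    """Toy model: one append-only operation log per core."""

    def __init__(self, ncores):
        self.logs = [[] for _ in range(ncores)]

    def log(self, core, kind, name):
        # Each core appends only to its own log, so logging an operation
        # never touches another core's log.
        self.logs[core].append(Op(next(_clock), kind, name))

    def fsync(self):
        # Each per-core log is already sorted by timestamp, so a k-way
        # merge yields one globally ordered sequence of operations.
        merged = list(heapq.merge(*self.logs))
        for log in self.logs:
            log.clear()
        return absorb(merged)


def absorb(ops):
    """Drop a create that is followed by an unlink of the same name."""
    pending = {}   # name -> index in result of a create not yet cancelled
    result = []
    for op in ops:
        if op.kind == "unlink" and op.name in pending:
            # The create and the unlink cancel out; neither reaches disk.
            result[pending.pop(op.name)] = None
        else:
            if op.kind == "create":
                pending[op.name] = len(result)
            result.append(op)
    return [op for op in result if op is not None]
```

In this model, if core 0 creates `a` and `tmp`, and core 1 creates `b` and later unlinks `tmp`, a call to `fsync()` returns only the two surviving creates in timestamp order; the short-lived `tmp` file is absorbed and would never be written to disk.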