Hybrid Transactional/Analytical Processing Amplifies IO in LSM-Trees

The log-structured merge tree (LSM-tree) has become an essential component in many key-value systems and expanded its scope to full-fledged database engines (e.g., MyRocks). In the database landscape, vendors face growing customer demands for real-time analytic solutions to handle hybrid transaction...

Full description

Bibliographic Details
Main Authors: Jongbin Kim, Jaechan Ahn, Kitaek Lee, Minsoo Ryu, Hyungsoo Jung
Format: Article
Language:English
Published: IEEE 2022-01-01
Series:IEEE Access
Subjects:
Online Access:https://ieeexplore.ieee.org/document/9940292/
Description
Summary:The log-structured merge tree (LSM-tree) has become an essential component in many key-value systems and expanded its scope to full-fledged database engines (e.g., MyRocks). In the database landscape, vendors face growing customer demands for real-time analytic solutions to handle hybrid transactional/analytical processing (HTAP) workloads that pose significant challenges. Among the challenges is IO amplification that drives system designers to rethink write-optimized engines to survive HTAP loads. This paper follows the same philosophy, reexamines LSM-trees used for database systems, and rethinks IO amplification under HTAP loads to shed some light on practical remedies for upcoming challenges. We propose two practical techniques to alleviate IO amplification: 1) aligned compaction for reducing write amplification, 2) snapshot filters for reducing read amplification. The two techniques are lightweight (i.e., near-zero resource consumption) and are compatible with state-of-the-art methods. We integrated our techniques into RocksDB and demonstrated that the modified RocksDB exhibits reduced IO amplification under HTAP workloads with negligible resource consumption.
ISSN:2169-3536