Hybrid Transactional/Analytical Processing Amplifies IO in LSM-Trees

The log-structured merge tree (LSM-tree) has become an essential component in many key-value systems and expanded its scope to full-fledged database engines (e.g., MyRocks). In the database landscape, vendors face growing customer demands for real-time analytic solutions to handle hybrid transaction...

Bibliographic Details
Main Authors: Jongbin Kim, Jaechan Ahn, Kitaek Lee, Minsoo Ryu, Hyungsoo Jung
Format: Article
Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
Subjects: LSM-tree, key-value store, MVCC, HTAP, I/O amplification
Online Access: https://ieeexplore.ieee.org/document/9940292/
author Jongbin Kim
Jaechan Ahn
Kitaek Lee
Minsoo Ryu
Hyungsoo Jung
collection DOAJ
description The log-structured merge tree (LSM-tree) has become an essential component in many key-value systems and expanded its scope to full-fledged database engines (e.g., MyRocks). In the database landscape, vendors face growing customer demands for real-time analytic solutions to handle hybrid transactional/analytical processing (HTAP) workloads that pose significant challenges. Among the challenges is IO amplification that drives system designers to rethink write-optimized engines to survive HTAP loads. This paper follows the same philosophy, reexamines LSM-trees used for database systems, and rethinks IO amplification under HTAP loads to shed some light on practical remedies for upcoming challenges. We propose two practical techniques to alleviate IO amplification: 1) aligned compaction for reducing write amplification, 2) snapshot filters for reducing read amplification. The two techniques are lightweight (i.e., near-zero resource consumption) and are compatible with state-of-the-art methods. We integrated our techniques into RocksDB and demonstrated that the modified RocksDB exhibits reduced IO amplification under HTAP workloads with negligible resource consumption.
first_indexed 2024-04-11T16:32:28Z
format Article
id doaj.art-6145d2486d6d472cb2c4cb7d67ee6441
institution Directory of Open Access Journals
issn 2169-3536
language English
last_indexed 2024-04-11T16:32:28Z
publishDate 2022-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-6145d2486d6d472cb2c4cb7d67ee6441
2022-12-22T04:13:59Z
eng
IEEE
IEEE Access, ISSN 2169-3536
2022-01-01, vol. 10, pp. 117626-117637
DOI 10.1109/ACCESS.2022.3219859
IEEE document 9940292
Hybrid Transactional/Analytical Processing Amplifies IO in LSM-Trees
Jongbin Kim
Jaechan Ahn
Kitaek Lee, https://orcid.org/0000-0003-1641-6556
Minsoo Ryu, https://orcid.org/0000-0002-4137-3052
Hyungsoo Jung, https://orcid.org/0000-0002-5376-7200
Department of Computer Science, Hanyang University, Seoul, South Korea (all five authors)
https://ieeexplore.ieee.org/document/9940292/
LSM-tree
key-value store
MVCC
HTAP
I/O amplification
title Hybrid Transactional/Analytical Processing Amplifies IO in LSM-Trees
topic LSM-tree
key-value store
MVCC
HTAP
I/O amplification
url https://ieeexplore.ieee.org/document/9940292/