Exploring Object Stores for High-Energy Physics Data Storage
Over the last two decades, ROOT TTree has been used for storing over one exabyte of High-Energy Physics (HEP) events. The TTree columnar on-disk layout has proven ideal for analyses of HEP data that typically require access to many events, but only a subset of the information stored for e...
Main Authors: | López-Gómez, Javier; Blomer, Jakob |
---|---|
Format: | Article |
Language: | English |
Published: | EDP Sciences, 2021-01-01 |
Series: | EPJ Web of Conferences |
Online Access: | https://www.epj-conferences.org/articles/epjconf/pdf/2021/05/epjconf_chep2021_02066.pdf |
_version_ | 1819090611282640896 |
---|---|
author | López-Gómez, Javier; Blomer, Jakob |
collection | DOAJ |
description | Over the last two decades, ROOT TTree has been used for storing over one exabyte of High-Energy Physics (HEP) events. The TTree columnar on-disk layout has proven ideal for analyses of HEP data that typically require access to many events, but only a subset of the information stored for each of them. Future colliders, and particularly HL-LHC, will bring an increase of at least one order of magnitude in the volume of generated data. Therefore, the use of modern storage hardware, such as low-latency high-bandwidth NVMe devices and distributed object stores, becomes more important. However, TTree was not designed to optimally exploit modern hardware and may become a bottleneck for data retrieval. The ROOT RNTuple I/O system aims at overcoming TTree’s limitations and at providing improved efficiency for modern storage systems. In this paper, we extend RNTuple with a backend that uses Intel DAOS as the underlying storage, demonstrating that the RNTuple architecture can accommodate high-performance object stores. From the user perspective, data can be accessed with minimal changes to the code, that is, by replacing a filesystem path with a DAOS URI. Our performance evaluation shows that the new backend can be used for realistic analyses, while outperforming the compatibility solution provided by the DAOS project. |
first_indexed | 2024-12-21T22:26:35Z |
format | Article |
id | doaj.art-f5d8f2c6f3104ab1b80175ce4fd0bd43 |
institution | Directory Open Access Journal |
issn | 2100-014X |
language | English |
last_indexed | 2024-12-21T22:26:35Z |
publishDate | 2021-01-01 |
publisher | EDP Sciences |
record_format | Article |
series | EPJ Web of Conferences |
spelling | doaj.art-f5d8f2c6f3104ab1b80175ce4fd0bd43; 2022-12-21T18:48:13Z; eng; EDP Sciences; EPJ Web of Conferences; ISSN 2100-014X; 2021-01-01; vol. 251; article 02066; DOI 10.1051/epjconf/202125102066; epjconf_chep2021_02066; Exploring Object Stores for High-Energy Physics Data Storage; López-Gómez Javier (CERN); Blomer Jakob (CERN); https://www.epj-conferences.org/articles/epjconf/pdf/2021/05/epjconf_chep2021_02066.pdf |
title | Exploring Object Stores for High-Energy Physics Data Storage |
url | https://www.epj-conferences.org/articles/epjconf/pdf/2021/05/epjconf_chep2021_02066.pdf |