Towards a responsive CernVM-FS architecture
The CernVM File System (CernVM-FS) provides a scalable and reliable software distribution service implemented as a POSIX read-only filesystem in user space (FUSE). It was originally developed at CERN to assist High Energy Physics (HEP) collaborations in deploying software on the worldwide distribute...
Main Authors: | Popescu Radu, Blomer Jakob, Ganis Gerardo |
---|---|
Format: | Article |
Language: | English |
Published: | EDP Sciences, 2019-01-01 |
Series: | EPJ Web of Conferences |
Online Access: | https://www.epj-conferences.org/articles/epjconf/pdf/2019/19/epjconf_chep2018_03036.pdf |
_version_ | 1818717538420260864 |
---|---|
author | Popescu Radu Blomer Jakob Ganis Gerardo |
collection | DOAJ |
description | The CernVM File System (CernVM-FS) provides a scalable and reliable software distribution service implemented as a POSIX read-only filesystem in user space (FUSE). It was originally developed at CERN to assist High Energy Physics (HEP) collaborations in deploying software on the worldwide distributed computing infrastructure for data processing applications. Files are stored remotely as content-addressed blocks on standard web servers and are retrieved and cached on-demand through outgoing HTTP connections only. Repository metadata is recorded in SQLite catalogs, which represent implicit Merkle tree encodings of the repository state. For writing, CernVM-FS follows a publish-subscribe pattern with a single source of new content that is propagated to a large number of readers. This paper focuses on the work to move the CernVM-FS architecture in the direction of a responsive data distribution system. A new distributed publication backend allows scaling out large publication tasks across multiple machines, reducing the time to publish. For the faster propagation of new published content, the addition of a notification system allows clients to subscribe to messages about changes in the repository and to request new root catalogs as soon as they become available. These developments make CernVM-FS more responsive and are particularly relevant for use cases where a short propagation delay from repository down to individual clients is important, such as using CernVM-FS as an AFS replacement for distributing software stacks. Additionally, they permit the implementation of more complex workflows, with producer-consumer pipelines, as for example in the ALICE analysis trains system. |
first_indexed | 2024-12-17T19:36:45Z |
format | Article |
id | doaj.art-55768eed77774ce788678b002cfece9d |
institution | Directory Open Access Journal |
issn | 2100-014X |
language | English |
last_indexed | 2024-12-17T19:36:45Z |
publishDate | 2019-01-01 |
publisher | EDP Sciences |
record_format | Article |
series | EPJ Web of Conferences |
spelling | Popescu Radu, Blomer Jakob, Ganis Gerardo. Towards a responsive CernVM-FS architecture. EPJ Web of Conferences 214, 03036 (2019). EDP Sciences. ISSN 2100-014X. doi:10.1051/epjconf/201921403036 |
title | Towards a responsive CernVM-FS architecture |
url | https://www.epj-conferences.org/articles/epjconf/pdf/2019/19/epjconf_chep2018_03036.pdf |
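
The description in this record states that files are stored as content-addressed blocks and that catalogs form an implicit Merkle tree encoding of the repository state. The following is a minimal sketch of that idea, not the CernVM-FS implementation: the in-memory `store` dictionary, the plain-text catalog format (a stand-in for the SQLite catalogs), the SHA-256 hash choice, and the helper names `store_block` and `publish_catalog` are all assumptions made for illustration.

```python
import hashlib


def store_block(store: dict, data: bytes) -> str:
    """Store data under the hex digest of its content and return that key."""
    key = hashlib.sha256(data).hexdigest()  # hash choice is an assumption
    store[key] = data
    return key


def publish_catalog(store: dict, entries: dict) -> str:
    """Serialize a name -> content-hash mapping and store it as a block.

    Because a catalog refers to its children by hash and is itself stored
    by hash, a change in any file changes every catalog hash up to the root.
    """
    lines = [f"{name} {digest}" for name, digest in sorted(entries.items())]
    return store_block(store, "\n".join(lines).encode())


# Hypothetical repository: one file, one nested catalog, one root catalog.
store = {}
lib_v1 = store_block(store, b"library v1")
nested_v1 = publish_catalog(store, {"lib.so": lib_v1})
root_v1 = publish_catalog(store, {"sw": nested_v1})

# Publishing a new file version yields a new nested and a new root catalog.
lib_v2 = store_block(store, b"library v2")
nested_v2 = publish_catalog(store, {"lib.so": lib_v2})
root_v2 = publish_catalog(store, {"sw": nested_v2})

# Identical content always maps to the same key, so it is stored only once.
duplicate = store_block(store, b"library v1")
```

In this sketch a reader only needs to learn the new root hash (`root_v2`) to discover the updated repository state, which is why the abstract's notification system announces new root catalogs to clients.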