Towards a responsive CernVM-FS architecture

The CernVM File System (CernVM-FS) provides a scalable and reliable software distribution service implemented as a POSIX read-only filesystem in user space (FUSE). It was originally developed at CERN to assist High Energy Physics (HEP) collaborations in deploying software on the worldwide distribute...

Bibliographic Details
Main Authors: Popescu Radu, Blomer Jakob, Ganis Gerardo
Format: Article
Language: English
Published: EDP Sciences 2019-01-01
Series: EPJ Web of Conferences
Online Access: https://www.epj-conferences.org/articles/epjconf/pdf/2019/19/epjconf_chep2018_03036.pdf
_version_ 1818717538420260864
author Popescu Radu
Blomer Jakob
Ganis Gerardo
author_facet Popescu Radu
Blomer Jakob
Ganis Gerardo
author_sort Popescu Radu
collection DOAJ
description The CernVM File System (CernVM-FS) provides a scalable and reliable software distribution service implemented as a POSIX read-only filesystem in user space (FUSE). It was originally developed at CERN to assist High Energy Physics (HEP) collaborations in deploying software on the worldwide distributed computing infrastructure for data processing applications. Files are stored remotely as content-addressed blocks on standard web servers and are retrieved and cached on demand through outgoing HTTP connections only. Repository metadata is recorded in SQLite catalogs, which represent implicit Merkle tree encodings of the repository state. For writing, CernVM-FS follows a publish-subscribe pattern with a single source of new content that is propagated to a large number of readers. This paper focuses on the work to move the CernVM-FS architecture in the direction of a responsive data distribution system. A new distributed publication backend allows scaling out large publication tasks across multiple machines, reducing the time to publish. For the faster propagation of newly published content, the addition of a notification system allows clients to subscribe to messages about changes in the repository and to request new root catalogs as soon as they become available. These developments make CernVM-FS more responsive and are particularly relevant for use cases where a short propagation delay from repository down to individual clients is important, such as using CernVM-FS as an AFS replacement for distributing software stacks. Additionally, they permit the implementation of more complex workflows, with producer-consumer pipelines, as for example in the ALICE analysis trains system.
first_indexed 2024-12-17T19:36:45Z
format Article
id doaj.art-55768eed77774ce788678b002cfece9d
institution Directory Open Access Journal
issn 2100-014X
language English
last_indexed 2024-12-17T19:36:45Z
publishDate 2019-01-01
publisher EDP Sciences
record_format Article
series EPJ Web of Conferences
spelling doaj.art-55768eed77774ce788678b002cfece9d2022-12-21T21:35:07ZengEDP SciencesEPJ Web of Conferences2100-014X2019-01-012140303610.1051/epjconf/201921403036epjconf_chep2018_03036Towards a responsive CernVM-FS architecturePopescu RaduBlomer JakobGanis GerardoThe CernVM File System (CernVM-FS) provides a scalable and reliable software distribution service implemented as a POSIX read-only filesystem in user space (FUSE). It was originally developed at CERN to assist High Energy Physics (HEP) collaborations in deploying software on the worldwide distributed computing infrastructure for data processing applications. Files are stored remotely as content-addressed blocks on standard web servers and are retrieved and cached on demand through outgoing HTTP connections only. Repository metadata is recorded in SQLite catalogs, which represent implicit Merkle tree encodings of the repository state. For writing, CernVM-FS follows a publish-subscribe pattern with a single source of new content that is propagated to a large number of readers. This paper focuses on the work to move the CernVM-FS architecture in the direction of a responsive data distribution system. A new distributed publication backend allows scaling out large publication tasks across multiple machines, reducing the time to publish. For the faster propagation of newly published content, the addition of a notification system allows clients to subscribe to messages about changes in the repository and to request new root catalogs as soon as they become available. These developments make CernVM-FS more responsive and are particularly relevant for use cases where a short propagation delay from repository down to individual clients is important, such as using CernVM-FS as an AFS replacement for distributing software stacks.
Additionally, they permit the implementation of more complex workflows, with producer-consumer pipelines, as for example in the ALICE analysis trains system.https://www.epj-conferences.org/articles/epjconf/pdf/2019/19/epjconf_chep2018_03036.pdf
spellingShingle Popescu Radu
Blomer Jakob
Ganis Gerardo
Towards a responsive CernVM-FS architecture
EPJ Web of Conferences
title Towards a responsive CernVM-FS architecture
title_full Towards a responsive CernVM-FS architecture
title_fullStr Towards a responsive CernVM-FS architecture
title_full_unstemmed Towards a responsive CernVM-FS architecture
title_short Towards a responsive CernVM-FS architecture
title_sort towards a responsive cernvm fs architecture
url https://www.epj-conferences.org/articles/epjconf/pdf/2019/19/epjconf_chep2018_03036.pdf
work_keys_str_mv AT popescuradu towardsaresponsivecernvmfsarchitecture
AT blomerjakob towardsaresponsivecernvmfsarchitecture
AT ganisgerardo towardsaresponsivecernvmfsarchitecture