CMS data access and usage studies at PIC Tier-1 and CIEMAT Tier-2

The current computing models from LHC experiments indicate that much larger resource increases would be required by the HL-LHC era (2026+) than those that technology evolution at a constant budget could bring. Since the worldwide budget for computing is not expected to increase, many research activities...

Bibliographic Details
Main Authors: Delgado Peris, Antonio; Flix Molina, José; Hernández, José M.; Pérez-Calero Yzquierdo, Antonio; Pérez Dengra, Carlos; Planas, Elena; Rodríguez Calonge, Francisco Javier; Sikora, Anna
Format: Article
Language: English
Published: EDP Sciences 2020-01-01
Series: EPJ Web of Conferences
Online Access: https://www.epj-conferences.org/articles/epjconf/pdf/2020/21/epjconf_chep2020_04028.pdf
_version_ 1818891154493538304
author Delgado Peris Antonio
Flix Molina José
Hernández José M.
Pérez-Calero Yzquierdo Antonio
Pérez Dengra Carlos
Planas Elena
Rodríguez Calonge Francisco Javier
Sikora Anna
author_sort Delgado Peris Antonio
collection DOAJ
description The current computing models from LHC experiments indicate that much larger resource increases would be required by the HL-LHC era (2026+) than those that technology evolution at a constant budget could bring. Since the worldwide budget for computing is not expected to increase, many research activities have emerged to improve the performance of the LHC processing software applications, as well as to propose more efficient resource deployment scenarios and data management techniques, which might reduce this expected increase of resources. The massively increasing amounts of data to be processed lead to enormous challenges for HEP storage systems, networks, and data distribution to end users. These challenges are particularly important in scenarios in which the LHC data would be distributed from a small number of centers holding the experiment’s data. Enabling data locality relative to computing tasks via local caches at sites seems a very promising approach to hide transfer latencies while reducing the deployed storage space and the overall number of replicas. However, this depends strongly on the workflow I/O characteristics and on the network available across sites. It is therefore crucial to assess how the experiments access and use the storage services deployed at WLCG sites, in order to evaluate and simulate the benefits of several of the emerging proposals within WLCG/HSF. This report reviews studies of storage access and usage, and of data access and popularity, for the CMS workflows executed at the Spanish Tier-1 (PIC) and Tier-2 (CIEMAT) sites supporting CMS activities, based on local and experiment monitoring data spanning more than one year. The results are relevant for simulating data caches for end-user analysis data, as well as for identifying potential areas for storage savings.
first_indexed 2024-12-19T17:36:18Z
format Article
id doaj.art-1c4de6e9abe440119130a4c942a41b1f
institution Directory Open Access Journal
issn 2100-014X
language English
last_indexed 2024-12-19T17:36:18Z
publishDate 2020-01-01
publisher EDP Sciences
record_format Article
series EPJ Web of Conferences
spelling doaj.art-1c4de6e9abe440119130a4c942a41b1f
2022-12-21T20:12:19Z
eng
EDP Sciences
EPJ Web of Conferences
2100-014X
2020-01-01
245
04028
10.1051/epjconf/202024504028
epjconf_chep2020_04028
CMS data access and usage studies at PIC Tier-1 and CIEMAT Tier-2
Delgado Peris Antonio 0
Flix Molina José
Hernández José M. 1
Pérez-Calero Yzquierdo Antonio
Pérez Dengra Carlos
Planas Elena 2
Rodríguez Calonge Francisco Javier 3
Sikora Anna 4
Centro de Investigaciones Medioambientales y Tecnológicas (CIEMAT)
Centro de Investigaciones Medioambientales y Tecnológicas (CIEMAT)
Institut de Física d’Altes Energies (IFAE)
Centro de Investigaciones Medioambientales y Tecnológicas (CIEMAT)
Universitat Autònoma de Barcelona (UAB)
https://www.epj-conferences.org/articles/epjconf/pdf/2020/21/epjconf_chep2020_04028.pdf
title CMS data access and usage studies at PIC Tier-1 and CIEMAT Tier-2
title_sort cms data access and usage studies at pic tier 1 and ciemat tier 2
url https://www.epj-conferences.org/articles/epjconf/pdf/2020/21/epjconf_chep2020_04028.pdf