CMS data access and usage studies at PIC Tier-1 and CIEMAT Tier-2
The current computing models from LHC experiments indicate that much larger resource increases would be required by the HL-LHC era (2026+) than those that technology evolution at a constant budget could bring. Since worldwide budget for computing is not expected to increase, many research activities...
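Main Authors: | Delgado Peris Antonio, Flix Molina José, Hernández José M., Pérez-Calero Yzquierdo Antonio, Pérez Dengra Carlos, Planas Elena, Rodríguez Calonge Francisco Javier, Sikora Anna |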
---|---|
Format: | Article |
Language: | English |
Published: | EDP Sciences 2020-01-01 |
Series: | EPJ Web of Conferences |
Online Access: | https://www.epj-conferences.org/articles/epjconf/pdf/2020/21/epjconf_chep2020_04028.pdf |
_version_ | 1818891154493538304 |
---|---|
author | Delgado Peris Antonio Flix Molina José Hernández José M. Pérez-Calero Yzquierdo Antonio Pérez Dengra Carlos Planas Elena Rodríguez Calonge Francisco Javier Sikora Anna |
author_facet | Delgado Peris Antonio Flix Molina José Hernández José M. Pérez-Calero Yzquierdo Antonio Pérez Dengra Carlos Planas Elena Rodríguez Calonge Francisco Javier Sikora Anna |
author_sort | Delgado Peris Antonio |
collection | DOAJ |
description | The current computing models from LHC experiments indicate that much larger resource increases would be required by the HL-LHC era (2026+) than those that technology evolution at a constant budget could bring. Since the worldwide budget for computing is not expected to increase, many research activities have emerged to improve the performance of the LHC processing software applications, as well as to propose more efficient resource deployment scenarios and data management techniques, which might reduce this expected increase of resources. The massively increasing amounts of data to be processed lead to enormous challenges for HEP storage systems, networks and the data distribution to end-users. These challenges are particularly important in scenarios in which the LHC data would be distributed from a small number of centers holding the experiment’s data. Enabling data locality relative to computing tasks via local caches at sites seems a very promising approach to hide transfer latencies while reducing the deployed storage space and the overall number of replicas. However, this depends strongly on the workflow I/O characteristics and the available network across sites. It is therefore crucial to assess how the experiments access and use the storage services deployed at WLCG sites, in order to evaluate and simulate the benefits of several of the emerging proposals within WLCG/HSF. This report reviews studies of storage access and usage, data access, and data popularity for the CMS workflows executed at the Spanish Tier-1 (PIC) and Tier-2 (CIEMAT) sites supporting CMS activities, based on local and experiment monitoring data spanning more than one year. The results are relevant for simulating data caches for end-user analysis data, as well as for identifying potential areas for storage savings. |
first_indexed | 2024-12-19T17:36:18Z |
format | Article |
id | doaj.art-1c4de6e9abe440119130a4c942a41b1f |
institution | Directory Open Access Journal |
issn | 2100-014X |
language | English |
last_indexed | 2024-12-19T17:36:18Z |
publishDate | 2020-01-01 |
publisher | EDP Sciences |
record_format | Article |
series | EPJ Web of Conferences |
spelling | doaj.art-1c4de6e9abe440119130a4c942a41b1f2022-12-21T20:12:19ZengEDP SciencesEPJ Web of Conferences2100-014X2020-01-012450402810.1051/epjconf/202024504028epjconf_chep2020_04028CMS data access and usage studies at PIC Tier-1 and CIEMAT Tier-2Delgado Peris Antonio0Flix Molina JoséHernández José M.1Pérez-Calero Yzquierdo AntonioPérez Dengra CarlosPlanas Elena2Rodríguez Calonge Francisco Javier3Sikora Anna4Centro de Investigaciones Medioambientales y Tecnológicas (CIEMAT)Centro de Investigaciones Medioambientales y Tecnológicas (CIEMAT)Institut de Física d’Altes Energíes (IFAE)Centro de Investigaciones Medioambientales y Tecnológicas (CIEMAT)Universitat Autònoma de Barcelona (UAB)The current computing models from LHC experiments indicate that much larger resource increases would be required by the HL-LHC era (2026+) than those that technology evolution at a constant budget could bring. Since worldwide budget for computing is not expected to increase, many research activities have emerged to improve the performance of the LHC processing software applications, as well as to propose more efficient resource deployment scenarios and data management techniques, which might reduce this expected increase of resources. The massively increasing amounts of data to be processed leads to enormous challenges for HEP storage systems, networks and the data distribution to end-users. These challenges are particularly important in scenarios in which the LHC data would be distributed from small numbers of centers holding the experiment’s data. Enabling data locality relative to computing tasks via local caches on sites seems a very promising approach to hide transfer latencies while reducing the deployed storage space and number of replicas overall. However, this highly depends on the workflow I/O characteristics and available network across sites. 
A crucial assessment of how the experiments are accessing and using the storage services deployed in WLCG sites, to evaluate and simulate the benefits for several of the new emerging proposals within WLCG/HSF. Studies on access and usage of storage, data access and popularity studies for the CMS workflows executed in the Spanish Tier-1 (PIC) and Tier-2 (CIEMAT) sites supporting CMS activities are reviewed in this report, based on local and experiment monitoring data, spanning more than one year. This is of relevance for simulation of data caches for end-user analysis data, as well as identifying potential areas for storage savings.https://www.epj-conferences.org/articles/epjconf/pdf/2020/21/epjconf_chep2020_04028.pdf |
spellingShingle | Delgado Peris Antonio Flix Molina José Hernández José M. Pérez-Calero Yzquierdo Antonio Pérez Dengra Carlos Planas Elena Rodríguez Calonge Francisco Javier Sikora Anna CMS data access and usage studies at PIC Tier-1 and CIEMAT Tier-2 EPJ Web of Conferences |
title | CMS data access and usage studies at PIC Tier-1 and CIEMAT Tier-2 |
title_full | CMS data access and usage studies at PIC Tier-1 and CIEMAT Tier-2 |
title_fullStr | CMS data access and usage studies at PIC Tier-1 and CIEMAT Tier-2 |
title_full_unstemmed | CMS data access and usage studies at PIC Tier-1 and CIEMAT Tier-2 |
title_short | CMS data access and usage studies at PIC Tier-1 and CIEMAT Tier-2 |
title_sort | cms data access and usage studies at pic tier 1 and ciemat tier 2 |
url | https://www.epj-conferences.org/articles/epjconf/pdf/2020/21/epjconf_chep2020_04028.pdf |
work_keys_str_mv | AT delgadoperisantonio cmsdataaccessandusagestudiesatpictier1andciemattier2 AT flixmolinajose cmsdataaccessandusagestudiesatpictier1andciemattier2 AT hernandezjosem cmsdataaccessandusagestudiesatpictier1andciemattier2 AT perezcaleroyzquierdoantonio cmsdataaccessandusagestudiesatpictier1andciemattier2 AT perezdengracarlos cmsdataaccessandusagestudiesatpictier1andciemattier2 AT planaselena cmsdataaccessandusagestudiesatpictier1andciemattier2 AT rodriguezcalongefranciscojavier cmsdataaccessandusagestudiesatpictier1andciemattier2 AT sikoraanna cmsdataaccessandusagestudiesatpictier1andciemattier2 |