Preparing Distributed Computing Operations for the HL-LHC Era With Operational Intelligence
As a joint effort from various communities involved in the Worldwide LHC Computing Grid, the Operational Intelligence project aims at increasing the level of automation in computing operations and reducing human interventions. The distributed computing systems currently deployed by the LHC experimen...
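Main Authors: | Alessandro Di Girolamo, Federica Legger, Panos Paparrigopoulos, Jaroslava Schovancová, Thomas Beermann, Michael Boehler, Daniele Bonacorsi, Luca Clissa, Leticia Decker de Sousa, Tommaso Diotalevi, Luca Giommi, Maria Grigorieva, Domenico Giordano, David Hohn, Tomáš Javůrek, Stephane Jezequel, Valentin Kuznetsov, Mario Lassnig, Vasilis Mageirakos, Micol Olocco, Siarhei Padolski, Matteo Paltenghi, Lorenzo Rinaldi, Mayank Sharma, Simone Rossi Tisbeni, Nikodemas Tuckus |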
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2022-01-01 |
Series: | Frontiers in Big Data |
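Subjects: | distributed computing operations; operational intelligence; HL-LHC; resources optimization; ML; NLP |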
Online Access: | https://www.frontiersin.org/articles/10.3389/fdata.2021.753409/full |
_version_ | 1798035601173250048 |
---|---|
author | Alessandro Di Girolamo; Federica Legger; Panos Paparrigopoulos; Jaroslava Schovancová; Thomas Beermann; Michael Boehler; Daniele Bonacorsi; Luca Clissa; Leticia Decker de Sousa; Tommaso Diotalevi; Luca Giommi; Maria Grigorieva; Domenico Giordano; David Hohn; Tomáš Javůrek; Stephane Jezequel; Valentin Kuznetsov; Mario Lassnig; Vasilis Mageirakos; Micol Olocco; Siarhei Padolski; Matteo Paltenghi; Lorenzo Rinaldi; Mayank Sharma; Simone Rossi Tisbeni; Nikodemas Tuckus |
author_sort | Alessandro Di Girolamo |
collection | DOAJ |
description | As a joint effort from various communities involved in the Worldwide LHC Computing Grid, the Operational Intelligence project aims at increasing the level of automation in computing operations and reducing human interventions. The distributed computing systems currently deployed by the LHC experiments have proven to be mature and capable of meeting the experimental goals, by allowing timely delivery of scientific results. However, a substantial number of interventions from software developers, shifters, and operational teams are needed to efficiently manage such heterogeneous infrastructures. Under the scope of the Operational Intelligence project, experts from several areas have gathered to propose and work on “smart” solutions. Machine learning, data mining, log analysis, and anomaly detection are only some of the tools we have evaluated for our use cases. In this community study contribution, we report on the development of a suite of operational intelligence services to cover various use cases: workload management, data management, and site operations. |
first_indexed | 2024-04-11T21:00:26Z |
format | Article |
id | doaj.art-b5518b8e6e7e4d84ad5368aa9e40ebb7 |
institution | Directory Open Access Journal |
issn | 2624-909X |
language | English |
last_indexed | 2024-04-11T21:00:26Z |
publishDate | 2022-01-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Big Data |
spelling | doaj.art-b5518b8e6e7e4d84ad5368aa9e40ebb7; 2022-12-22T04:03:32Z; eng; Frontiers Media S.A.; Frontiers in Big Data; 2624-909X; 2022-01-01; 4; 10.3389/fdata.2021.753409; 753409; Preparing Distributed Computing Operations for the HL-LHC Era With Operational Intelligence; Alessandro Di Girolamo (CERN, Geneva, Switzerland); Federica Legger (INFN Turin, Torino, Italy); Panos Paparrigopoulos (CERN, Geneva, Switzerland); Jaroslava Schovancová (CERN, Geneva, Switzerland); Thomas Beermann (Bergische Universitaet Wuppertal, Wuppertal, Germany); Michael Boehler (Physikalisches Institut, Albert-Ludwigs-Universitaet Freiburg, Freiburg, Germany); Daniele Bonacorsi (University of Bologna, Bologna, Italy, and INFN Bologna, Bologna, Italy); Luca Clissa (University of Bologna, Bologna, Italy, and INFN Bologna, Bologna, Italy); Leticia Decker de Sousa (University of Bologna, Bologna, Italy, and INFN Bologna, Bologna, Italy); Tommaso Diotalevi (University of Bologna, Bologna, Italy, and INFN Bologna, Bologna, Italy); Luca Giommi (University of Bologna, Bologna, Italy, and INFN Bologna, Bologna, Italy); Maria Grigorieva (Lomonosov Moscow State University, Moscow, Russia); Domenico Giordano (CERN, Geneva, Switzerland); David Hohn (Physikalisches Institut, Albert-Ludwigs-Universitaet Freiburg, Freiburg, Germany); Tomáš Javůrek (CERN, Geneva, Switzerland); Stephane Jezequel (LAPP, Université Grenoble Alpes, Université Savoie Mont Blanc, CNRS/IN2P3, Annecy, France); Valentin Kuznetsov (Cornell University, Ithaca, NY, United States); Mario Lassnig (CERN, Geneva, Switzerland); Vasilis Mageirakos (CERN, Geneva, Switzerland); Micol Olocco (INFN Turin, Torino, Italy); Siarhei Padolski (Brookhaven National Laboratory, Upton, NY, United States); Matteo Paltenghi (CERN, Geneva, Switzerland); Lorenzo Rinaldi (University of Bologna, Bologna, Italy, and INFN Bologna, Bologna, Italy); Mayank Sharma (CERN, Geneva, Switzerland); Simone Rossi Tisbeni (INFN-CNAF Bologna, Bologna, Italy); Nikodemas Tuckus (Vilnius University, Vilnius, Lithuania) |
title | Preparing Distributed Computing Operations for the HL-LHC Era With Operational Intelligence |
topic | distributed computing operations operational intelligence HL-LHC resources optimization ML NLP |
url | https://www.frontiersin.org/articles/10.3389/fdata.2021.753409/full |