Evolution of monitoring, accounting and alerting services at INFN-CNAF Tier-1

CNAF is the national center of INFN (Italian National Institute for Nuclear Physics) for IT services. The Tier-1 data center operated at CNAF offers computing and storage resources to scientific communities such as those working on the four LHC (Large Hadron Collider) experiments at CERN an...

Bibliographic Details
Main Authors: Dal Pra Stefano, Falabella Antonio, Fattibene Enrico, Cincinelli Gianluca, Magnani Matteo, De Cristofaro Tiziano, Ruini Martin
Format: Article
Language: English
Published: EDP Sciences 2019-01-01
Series: EPJ Web of Conferences
Online Access: https://www.epj-conferences.org/articles/epjconf/pdf/2019/19/epjconf_chep2018_08033.pdf
author Dal Pra Stefano
Falabella Antonio
Fattibene Enrico
Cincinelli Gianluca
Magnani Matteo
De Cristofaro Tiziano
Ruini Martin
collection DOAJ
description CNAF is the national center of INFN (Italian National Institute for Nuclear Physics) for IT services. The Tier-1 data center operated at CNAF offers computing and storage resources to scientific communities such as those working on the four LHC (Large Hadron Collider) experiments at CERN and 30 other experiments in which INFN is involved. In past years, monitoring and alerting for Tier-1 resources relied on several software tools: LEMON (developed at CERN and customized for the characteristics of data centers managing scientific data), Nagios (used mainly for alerting), and a system based on the Graphite database together with other ad hoc services and web pages. By 2015, a task force had been organized to define and deploy a common infrastructure (based on Sensu, InfluxDB and Grafana) for use by the different CNAF departments. Once the new infrastructure was deployed, a major task was to adapt all of the monitoring and alerting services to it. We present the steps that the Tier-1 group followed to accomplish a full migration, which is now complete, with all the new services in production. In particular, we show the redesign of the monitoring sensors and alerting checks to fit the Sensu-based infrastructure, the creation of web dashboards for data presentation, and the porting of historical data from LEMON/Graphite to InfluxDB.
first_indexed 2024-12-17T01:34:27Z
format Article
id doaj.art-8cad165330b046518f4793050c2532da
institution Directory Open Access Journal
issn 2100-014X
language English
last_indexed 2024-12-17T01:34:27Z
publishDate 2019-01-01
publisher EDP Sciences
record_format Article
series EPJ Web of Conferences
spelling doaj.art-8cad165330b046518f4793050c2532da | 2022-12-21T22:08:29Z | eng | EDP Sciences | EPJ Web of Conferences | 2100-014X | 2019-01-01 | 214 | 08033 | 10.1051/epjconf/201921408033 | epjconf_chep2018_08033
title Evolution of monitoring, accounting and alerting services at INFN-CNAF Tier-1
url https://www.epj-conferences.org/articles/epjconf/pdf/2019/19/epjconf_chep2018_08033.pdf