FAIR AI models in high energy physics
The findable, accessible, interoperable, and reusable (FAIR) data principles provide a framework for examining, evaluating, and improving how data is shared to facilitate scientific discovery. Generalizing these principles to research software and other digital products is an active area of research...
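Main Authors: | Javier Duarte, Haoyang Li, Avik Roy, Ruike Zhu, E A Huerta, Daniel Diaz, Philip Harris, Raghav Kansal, Daniel S Katz, Ishaan H Kavoori, Volodymyr V Kindratenko, Farouk Mokhtar, Mark S Neubauer, Sang Eon Park, Melissa Quinnan, Roger Rusack, Zhizhen Zhao |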
---|---|
Format: | Article |
Language: | English |
Published: | IOP Publishing, 2023-01-01 |
Series: | Machine Learning: Science and Technology |
Subjects: | FAIR; AI; high energy physics; Higgs boson; ML |
Online Access: | https://doi.org/10.1088/2632-2153/ad12e3 |
_version_ | 1827395823592275968 |
---|---|
author | Javier Duarte; Haoyang Li; Avik Roy; Ruike Zhu; E A Huerta; Daniel Diaz; Philip Harris; Raghav Kansal; Daniel S Katz; Ishaan H Kavoori; Volodymyr V Kindratenko; Farouk Mokhtar; Mark S Neubauer; Sang Eon Park; Melissa Quinnan; Roger Rusack; Zhizhen Zhao |
author_facet | Javier Duarte; Haoyang Li; Avik Roy; Ruike Zhu; E A Huerta; Daniel Diaz; Philip Harris; Raghav Kansal; Daniel S Katz; Ishaan H Kavoori; Volodymyr V Kindratenko; Farouk Mokhtar; Mark S Neubauer; Sang Eon Park; Melissa Quinnan; Roger Rusack; Zhizhen Zhao |
author_sort | Javier Duarte |
collection | DOAJ |
description | The findable, accessible, interoperable, and reusable (FAIR) data principles provide a framework for examining, evaluating, and improving how data is shared to facilitate scientific discovery. Generalizing these principles to research software and other digital products is an active area of research. Machine learning models—algorithms that have been trained on data without being explicitly programmed—and more generally, artificial intelligence (AI) models, are an important target for this because of the ever-increasing pace with which AI is transforming scientific domains, such as experimental high energy physics (HEP). In this paper, we propose a practical definition of FAIR principles for AI models in HEP and describe a template for the application of these principles. We demonstrate the template’s use with an example AI model applied to HEP, in which a graph neural network is used to identify Higgs bosons decaying to two bottom quarks. We report on the robustness of this FAIR AI model, its portability across hardware architectures and software frameworks, and its interpretability. |
first_indexed | 2024-03-08T18:41:24Z |
format | Article |
id | doaj.art-edd2fcfe5d994d0f84af824d77a1d18a |
institution | Directory Open Access Journal |
issn | 2632-2153 |
language | English |
last_indexed | 2024-03-08T18:41:24Z |
publishDate | 2023-01-01 |
publisher | IOP Publishing |
record_format | Article |
series | Machine Learning: Science and Technology |
spelling | doaj.art-edd2fcfe5d994d0f84af824d77a1d18a; 2023-12-29T07:01:12Z; eng; IOP Publishing; Machine Learning: Science and Technology; 2632-2153; 2023-01-01; 4; 4; 045062; 10.1088/2632-2153/ad12e3; FAIR AI models in high energy physics. Authors: [0] Javier Duarte (https://orcid.org/0000-0002-5076-7096); [1] Haoyang Li (https://orcid.org/0000-0003-2599-4948); [2] Avik Roy (https://orcid.org/0000-0002-0116-1012); [3] Ruike Zhu; [4] E A Huerta (https://orcid.org/0000-0002-9682-3604); [5] Daniel Diaz (https://orcid.org/0000-0001-6834-1176); [6] Philip Harris (https://orcid.org/0000-0001-8189-3741); [7] Raghav Kansal (https://orcid.org/0000-0003-2445-1060); [8] Daniel S Katz (https://orcid.org/0000-0001-5934-7525); [9] Ishaan H Kavoori; [10] Volodymyr V Kindratenko (https://orcid.org/0000-0002-9336-4756); [11] Farouk Mokhtar (https://orcid.org/0000-0003-2533-3402); [12] Mark S Neubauer (https://orcid.org/0000-0001-8434-9274); [13] Sang Eon Park (https://orcid.org/0000-0003-3225-0007); [14] Melissa Quinnan (https://orcid.org/0000-0003-2902-5597); [15] Roger Rusack (https://orcid.org/0000-0002-7633-749X); [16] Zhizhen Zhao. Affiliations, in the order given in the record: [0] University of California San Diego, La Jolla, CA 92093, United States of America; [1] University of California San Diego, La Jolla, CA 92093, United States of America; [2] University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States of America; [3] University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States of America, and Argonne National Laboratory, Lemont, IL 60439, United States of America; [4] Argonne National Laboratory, Lemont, IL 60439, United States of America, and The University of Chicago, Chicago, IL 60637, United States of America; [5] University of California San Diego, La Jolla, CA 92093, United States of America; [6] Massachusetts Institute of Technology, Cambridge, MA 02139, United States of America; [7] University of California San Diego, La Jolla, CA 92093, United States of America; [8] University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States of America; [9] University of California San Diego, La Jolla, CA 92093, United States of America; [10] University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States of America; [11] University of California San Diego, La Jolla, CA 92093, United States of America, and Halıcıoğlu Data Science Institute, La Jolla, CA 92093, United States of America; [12] University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States of America; [13] Massachusetts Institute of Technology, Cambridge, MA 02139, United States of America; [14] University of California San Diego, La Jolla, CA 92093, United States of America; [15] The University of Minnesota, Minneapolis, MN 55405, United States of America; [16] University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States of America. Keywords: FAIR; AI; high energy physics; Higgs boson; ML. URL: https://doi.org/10.1088/2632-2153/ad12e3 |
title | FAIR AI models in high energy physics |
topic | FAIR; AI; high energy physics; Higgs boson; ML |
url | https://doi.org/10.1088/2632-2153/ad12e3 |