Machine-learned exclusion limits without binning

Abstract: The machine-learned likelihoods (MLL) method combines machine-learning classification techniques with likelihood-based inference tests to estimate the experimental sensitivity of high-dimensional data sets. We extend the MLL method with kernel density estimators (KDE), which extract the resulting one-dimensional signal and background probability density functions without binning the classifier output. We first test the method on toy models generated with multivariate Gaussian distributions, where the true probability density functions are known. We then apply it to two cases of interest at the LHC: a search for exotic Higgs bosons, and a $Z'$ boson decaying into lepton pairs. In contrast to physics-based quantities, the typical fluctuations of the ML outputs give non-smooth probability distributions for pure-signal and pure-background samples. This non-smoothness propagates into the density estimates precisely because of the good performance and flexibility of the KDE method. We study its impact on the final significance computation and compare against results obtained by averaging several independent ML output realizations, which yields smoother distributions. We conclude that the significance estimate is not sensitive to this issue.
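The pipeline the abstract describes (classifier output → unbinned KDE densities → likelihood-ratio significance) can be illustrated with a short sketch. This is a minimal illustration, not the authors' code: it assumes scikit-learn and SciPy, uses toy multivariate Gaussian samples in the spirit of the paper's first test, and substitutes a simple extended likelihood-ratio statistic for the paper's exact procedure; the yields S and B, the network architecture, and the sample sizes are invented for illustration.

```python
# Minimal sketch of the MLL + KDE pipeline, assuming scikit-learn and SciPy.
# Toy Gaussian data stands in for the LHC samples; yields, network size,
# and the significance formula are illustrative assumptions.
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Toy model: multivariate Gaussian signal and background, where the
# true probability density functions are known.
dim = 5
bkg = rng.normal(0.0, 1.0, size=(20_000, dim))
sig = rng.normal(0.5, 1.0, size=(20_000, dim))
X = np.vstack([sig, bkg])
y = np.concatenate([np.ones(len(sig)), np.zeros(len(bkg))])

# Step 1 -- ML classification: compress each high-dimensional event x
# into a one-dimensional classifier output o(x).
clf = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=200).fit(X, y)
o_sig = clf.predict_proba(sig)[:, 1]
o_bkg = clf.predict_proba(bkg)[:, 1]

# Step 2 -- KDE: estimate the one-dimensional signal and background
# densities of o(x) directly, with no histogram binning.
kde_sig = gaussian_kde(o_sig)
kde_bkg = gaussian_kde(o_bkg)

# Step 3 -- likelihood-based inference: an extended log-likelihood ratio
# between the signal-plus-background and background-only hypotheses,
# q0 = 2 * [ sum_i ln(1 + S*p_s(o_i) / (B*p_b(o_i))) - S ],
# evaluated on an Asimov-like dataset with expected yields S and B.
S, B = 100.0, 10_000.0
obs = np.concatenate([rng.choice(o_sig, int(S)), rng.choice(o_bkg, int(B))])
q0 = 2.0 * (np.sum(np.log1p(S * kde_sig(obs) / (B * kde_bkg(obs)))) - S)
print(f"approximate median significance: {np.sqrt(max(q0, 0.0)):.2f} sigma")
```

The smoothing comparison mentioned in the abstract could be reproduced in the same spirit by training several independent classifiers with different random seeds, averaging their per-event outputs, and feeding the averaged output to the KDEs.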


Bibliographic Details
Main Authors: Ernesto Arganda (Departamento de Física Teórica, Universidad Autónoma de Madrid), Andres D. Perez (Departamento de Física Teórica, Universidad Autónoma de Madrid), Martín de los Rios (Departamento de Física Teórica, Universidad Autónoma de Madrid), Rosa María Sandá Seoane (Instituto de Física Teórica UAM-CSIC)
Format: Article
Language: English
Published: SpringerOpen, 2023-12-01
Series: European Physical Journal C: Particles and Fields
ISSN: 1434-6052
Online Access: https://doi.org/10.1140/epjc/s10052-023-12314-z