Null hypothesis test for anomaly detection
We extend the use of Classification Without Labels for anomaly detection with a hypothesis test designed to exclude the background-only hypothesis. By testing for statistical independence of the two discriminating dataset regions, we are able to exclude the background-only hypothesis without relying...
Main Authors: | Jernej F. Kamenik, Manuel Szewc |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2023-05-01 |
Series: | Physics Letters B |
Online Access: | http://www.sciencedirect.com/science/article/pii/S0370269323001703 |
_version_ | 1797843898674970624 |
---|---|
author | Jernej F. Kamenik Manuel Szewc |
author_facet | Jernej F. Kamenik Manuel Szewc |
author_sort | Jernej F. Kamenik |
collection | DOAJ |
description | We extend the use of Classification Without Labels for anomaly detection with a hypothesis test designed to exclude the background-only hypothesis. By testing for statistical independence of the two discriminating dataset regions, we are able to exclude the background-only hypothesis without relying on fixed anomaly score cuts or extrapolations of background estimates between regions. The method relies on the assumption of conditional independence of anomaly score features and dataset regions, which can be ensured using existing decorrelation techniques. As a benchmark example, we consider the LHC Olympics dataset where we show that mutual information represents a suitable test for statistical independence and our method exhibits excellent and robust performance at different signal fractions even in presence of realistic feature correlations. |
first_indexed | 2024-04-09T17:13:25Z |
format | Article |
id | doaj.art-0dea53a32f6f42f9964fc98045199df5 |
institution | Directory Open Access Journal |
issn | 0370-2693 |
language | English |
last_indexed | 2024-04-09T17:13:25Z |
publishDate | 2023-05-01 |
publisher | Elsevier |
record_format | Article |
series | Physics Letters B |
spelling | doaj.art-0dea53a32f6f42f9964fc98045199df5 2023-04-20T04:35:29Z eng Elsevier Physics Letters B 0370-2693 2023-05-01 840 137836 Null hypothesis test for anomaly detection Jernej F. Kamenik 0 Manuel Szewc 1 Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia; Faculty of Mathematics and Physics, University of Ljubljana, Jadranska 19, 1000 Ljubljana, Slovenia Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia; Corresponding author. We extend the use of Classification Without Labels for anomaly detection with a hypothesis test designed to exclude the background-only hypothesis. By testing for statistical independence of the two discriminating dataset regions, we are able to exclude the background-only hypothesis without relying on fixed anomaly score cuts or extrapolations of background estimates between regions. The method relies on the assumption of conditional independence of anomaly score features and dataset regions, which can be ensured using existing decorrelation techniques. As a benchmark example, we consider the LHC Olympics dataset where we show that mutual information represents a suitable test for statistical independence and our method exhibits excellent and robust performance at different signal fractions even in presence of realistic feature correlations. http://www.sciencedirect.com/science/article/pii/S0370269323001703 |
spellingShingle | Jernej F. Kamenik Manuel Szewc Null hypothesis test for anomaly detection Physics Letters B |
title | Null hypothesis test for anomaly detection |
title_full | Null hypothesis test for anomaly detection |
title_fullStr | Null hypothesis test for anomaly detection |
title_full_unstemmed | Null hypothesis test for anomaly detection |
title_short | Null hypothesis test for anomaly detection |
title_sort | null hypothesis test for anomaly detection |
url | http://www.sciencedirect.com/science/article/pii/S0370269323001703 |
work_keys_str_mv | AT jernejfkamenik nullhypothesistestforanomalydetection AT manuelszewc nullhypothesistestforanomalydetection |
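
The description above says that mutual information is used to test the statistical independence of the anomaly score and the dataset region. The record does not give the authors' procedure, so the following is only a minimal sketch of that general idea, under stated assumptions: the score is binned by quantiles, mutual information is computed with a simple plug-in estimator, and the null (background-only) distribution is approximated by permuting region labels. The permutation test, the toy Gaussian data, and every function name here are illustrative choices, not taken from the paper.

```python
import numpy as np


def mutual_information(score_bins, region, n_bins):
    """Plug-in mutual information (in nats) between a binned score
    and a binary region label (0 or 1)."""
    joint = np.zeros((n_bins, 2))
    np.add.at(joint, (score_bins, region), 1.0)  # contingency table of counts
    joint /= joint.sum()
    p_score = joint.sum(axis=1, keepdims=True)   # marginal over regions
    p_region = joint.sum(axis=0, keepdims=True)  # marginal over score bins
    product = p_score @ p_region                 # distribution under independence
    mask = joint > 0                             # empty cells contribute zero
    return float(np.sum(joint[mask] * np.log(joint[mask] / product[mask])))


def independence_p_value(score, region, n_bins=10, n_perm=500, seed=0):
    """Permutation p-value for the hypothesis that score and region are
    independent (hypothetical helper, assumed for illustration)."""
    rng = np.random.default_rng(seed)
    # Quantile binning: interior edges give bin indices 0 .. n_bins - 1.
    edges = np.quantile(score, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    bins = np.searchsorted(edges, score, side="right")
    observed = mutual_information(bins, region, n_bins)
    # Shuffling the region labels destroys any score-region dependence.
    null = np.array([
        mutual_information(bins, rng.permutation(region), n_bins)
        for _ in range(n_perm)
    ])
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, float(p_value)


def toy_sample(n_bkg, n_sig, seed):
    """Toy data (assumption): background scores are independent of region,
    signal events have higher scores and fall only in region 1."""
    rng = np.random.default_rng(seed)
    bkg_region = rng.integers(0, 2, size=n_bkg)
    bkg_score = rng.normal(0.0, 1.0, size=n_bkg)
    sig_region = np.ones(n_sig, dtype=int)
    sig_score = rng.normal(2.0, 1.0, size=n_sig)
    return (np.concatenate([bkg_score, sig_score]),
            np.concatenate([bkg_region, sig_region]))


if __name__ == "__main__":
    for n_sig in (0, 500):
        score, region = toy_sample(n_bkg=5000, n_sig=n_sig, seed=1)
        mi, p = independence_p_value(score, region)
        print(f"n_sig={n_sig}: MI={mi:.5f} nats, p-value={p:.4f}")
```

A small p-value indicates that the score and the region are not independent, which in this toy setting is the signature of an injected signal. The smallest attainable p-value is 1 / (1 + n_perm), so the number of permutations sets the resolution of the test.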