Null hypothesis test for anomaly detection

We extend the use of Classification Without Labels for anomaly detection with a hypothesis test designed to exclude the background-only hypothesis. By testing for statistical independence of the two discriminating dataset regions, we are able to exclude the background-only hypothesis without relying...

Bibliographic Details
Main Authors: Jernej F. Kamenik, Manuel Szewc
Format: Article
Language: English
Published: Elsevier 2023-05-01
Series: Physics Letters B
Online Access: http://www.sciencedirect.com/science/article/pii/S0370269323001703
Volume: 840
Article Number: 137836
ISSN: 0370-2693
Author Affiliations:
Jernej F. Kamenik: Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia; Faculty of Mathematics and Physics, University of Ljubljana, Jadranska 19, 1000 Ljubljana, Slovenia
Manuel Szewc (corresponding author): Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia
Description: We extend the use of Classification Without Labels for anomaly detection with a hypothesis test designed to exclude the background-only hypothesis. By testing for statistical independence of the two discriminating dataset regions, we are able to exclude the background-only hypothesis without relying on fixed anomaly score cuts or extrapolations of background estimates between regions. The method relies on the assumption of conditional independence of anomaly score features and dataset regions, which can be ensured using existing decorrelation techniques. As a benchmark example, we consider the LHC Olympics dataset where we show that mutual information represents a suitable test for statistical independence and our method exhibits excellent and robust performance at different signal fractions even in presence of realistic feature correlations.
Collection: DOAJ (Directory of Open Access Journals)
Record ID: doaj.art-0dea53a32f6f42f9964fc98045199df5
First Indexed: 2024-04-09T17:13:25Z
Last Indexed: 2024-04-09T17:13:25Z
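
Illustrative note: the description above names mutual information as a test for statistical independence between anomaly score and dataset region. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes a discretized anomaly score, a binary region label, and a permutation-based p-value; the toy data and all function names (`mutual_information`, `permutation_p_value`, `toy_sample`) are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in nats for two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log(p(x,y) / (p(x) p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def permutation_p_value(score_bins, regions, n_perm=200, seed=0):
    """Assumed calibration: shuffle region labels to emulate independence."""
    rng = random.Random(seed)
    observed = mutual_information(score_bins, regions)
    shuffled = list(regions)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mutual_information(score_bins, shuffled) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive
    return observed, (exceed + 1) / (n_perm + 1)


def toy_sample(n_events, signal_fraction, seed):
    """Hypothetical toy data: 5 score bins, 2 regions, signal only in region 1."""
    rng = random.Random(seed)
    score_bins, regions = [], []
    for _ in range(n_events):
        region = rng.randint(0, 1)
        score_bin = rng.randrange(5)  # background: score independent of region
        if region == 1 and rng.random() < signal_fraction:
            score_bin = 4  # toy signal piles up in the top score bin
        score_bins.append(score_bin)
        regions.append(region)
    return score_bins, regions


# Compare a background-only toy sample with one containing toy signal
for fraction in (0.0, 0.3):
    scores, regs = toy_sample(2000, fraction, seed=1)
    mi, p = permutation_p_value(scores, regs, n_perm=200, seed=2)
    print(f"signal fraction {fraction}: MI = {mi:.4f} nats, p = {p:.4f}")
```

In this sketch, a small p-value indicates that score and region are not independent, which under the stated conditional-independence assumption would disfavor the background-only hypothesis. No fixed cut on the anomaly score is applied; the whole binned score distribution enters the statistic.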