Entropy-Based Anomaly Detection for Gaussian Mixture Modeling

Bibliographic Details
Main Author: Luca Scrucca
Format: Article
Language: English
Published: MDPI AG 2023-04-01
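Series: Algorithms
Subjects: Gaussian mixture modeling; cluster analysis; noise component; outliers; entropy of Gaussian mixtures; EM algorithm
Online Access: https://www.mdpi.com/1999-4893/16/4/195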
author Luca Scrucca
collection DOAJ
description Gaussian mixture modeling is a generative probabilistic model that assumes that the observed data are generated from a mixture of multiple Gaussian distributions. This mixture model provides a flexible approach to model complex distributions that may not be easily represented by a single Gaussian distribution. The Gaussian mixture model with a noise component refers to a finite mixture that includes an additional noise component to model the background noise or outliers in the data. This additional noise component helps to take into account the presence of anomalies or outliers in the data. This latter aspect is crucial for anomaly detection in situations where a clear, early warning of an abnormal condition is required. This paper proposes a novel entropy-based procedure for initializing the noise component in Gaussian mixture models. Our approach is shown to be easy to implement and effective for anomaly detection. We successfully identify anomalies in both simulated and real-world datasets, even in the presence of significant levels of noise and outliers. We provide a step-by-step description of the proposed data analysis process, along with the corresponding R code, which is publicly available in a GitHub repository.
format Article
id doaj.art-c1262e13263d4c76a8126568c024cdd6
institution Directory Open Access Journal
issn 1999-4893
language English
publishDate 2023-04-01
publisher MDPI AG
series Algorithms
spelling doaj.art-c1262e13263d4c76a8126568c024cdd6; 2023-11-17T17:59:06Z; eng; MDPI AG; Algorithms; ISSN 1999-4893; 2023-04-01; 16(4), 195; doi:10.3390/a16040195; Entropy-Based Anomaly Detection for Gaussian Mixture Modeling; Luca Scrucca (Department of Economics, Università degli Studi di Perugia, Via A. Pascoli 20, 06123 Perugia, Italy)
title Entropy-Based Anomaly Detection for Gaussian Mixture Modeling
topic Gaussian mixture modeling
cluster analysis
noise component
outliers
entropy of Gaussian mixtures
EM algorithm
url https://www.mdpi.com/1999-4893/16/4/195
work_keys_str_mv AT lucascrucca entropybasedanomalydetectionforgaussianmixturemodeling
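
Illustrative note (not part of the catalogued article): the description above refers to a Gaussian mixture with an additional noise component. A common way to write such a model, assumed here for illustration, is f(x) = p0 / V + sum_k p_k N(x; mu_k, sigma_k^2), where the noise is uniform over a region of size V. The Python sketch below evaluates the posterior probability that a univariate observation belongs to the noise component. It is not the article's R code and does not reproduce its entropy-based initialization. The function names and all parameter values are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate Gaussian N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def noise_posterior(x, weights, means, sds, noise_weight, volume):
    """Posterior probability that x was generated by the noise component.

    Assumed mixture density (an illustrative form, not taken from the article):
        f(x) = noise_weight / volume
               + sum_k weights[k] * N(x; means[k], sds[k]^2)
    with noise_weight + sum(weights) == 1, and `volume` the length of the
    interval over which the noise is assumed to be uniform.
    """
    noise_part = noise_weight / volume
    gauss_part = sum(
        w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)
    )
    return noise_part / (noise_part + gauss_part)


# Hypothetical parameters: two Gaussian clusters plus 5% uniform noise
# spread over the interval [-10, 10] (length 20).
params = dict(
    weights=[0.55, 0.40],
    means=[-2.0, 3.0],
    sds=[1.0, 0.5],
    noise_weight=0.05,
    volume=20.0,
)

for x in (-2.0, 3.0, 8.0):
    print(f"x = {x:5.1f}  P(noise | x) = {noise_posterior(x, **params):.3f}")
```

Points near a cluster centre receive a noise probability close to zero, while a point far from every cluster is attributed almost entirely to the noise component. Thresholding this probability is one simple way to flag anomalies once such a model has been fitted.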