A Novel Unsupervised Outlier Detection Algorithm Based on Mutual Information and Reduced Spectral Clustering
Outlier detection is an essential research field in data mining, especially in the areas of network security, credit card fraud detection, industrial flaw detection, etc. The existing outlier detection algorithms, which can be divided into supervised methods and unsupervised methods, suffer from the...
Main Authors: | Yuehua Huang, Wenfen Liu, Song Li, Ying Guo, Wen Chen |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-12-01 |
Series: | Electronics |
Subjects: | outlier detection, unsupervised, mutual information, spectral clustering |
Online Access: | https://www.mdpi.com/2079-9292/12/23/4864 |
author | Yuehua Huang, Wenfen Liu, Song Li, Ying Guo, Wen Chen |
---|---|
collection | DOAJ |
description | Outlier detection is an essential research field in data mining, especially in the areas of network security, credit card fraud detection, industrial flaw detection, etc. The existing outlier detection algorithms, which can be divided into supervised methods and unsupervised methods, suffer from the following problems: curse of dimensionality, lack of labeled data, and hyperparameter tuning. To address these issues, we present a novel unsupervised outlier detection algorithm based on mutual information and reduced spectral clustering, called MISC-OD (Mutual Information and reduced Spectral Clustering—Outlier Detection). MISC-OD first constructs a mutual information matrix between features, then, by applying reduced spectral clustering, divides the feature set into subsets, utilizing the LOF (Local Outlier Factor) for outlier detection within each subset and combining the outlier scores found within each subset. Finally, it outputs the outlier score. Our contributions are as follows: (1) we propose a novel outlier detection method called MISC-OD with high interpretability and scalability; (2) numerous experiments on 18 benchmark datasets demonstrate the superior performance of the MISC-OD algorithm compared with eight state-of-the-art baselines in terms of ROC (receiver operating characteristic) and AP (average precision). |
id | doaj.art-f2b0b1c83578459694f13d75bfe51c47 |
institution | Directory of Open Access Journals |
issn | 2079-9292 |
doi | 10.3390/electronics12234864 |
citation | Electronics, vol. 12, no. 23, article 4864 (2023-12-01) |
affiliation | School of Computer Science and Information Security & School of Software Engineering, Guilin University of Electronic Technology, Guilin 541004, China (all five authors) |
title | A Novel Unsupervised Outlier Detection Algorithm Based on Mutual Information and Reduced Spectral Clustering |
topic | outlier detection, unsupervised, mutual information, spectral clustering |
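
The pipeline summarized in the description (mutual information matrix between features, spectral grouping of features, LOF within each feature subset, combined score) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation, and several pieces are assumptions made only for illustration: the histogram-based mutual information estimate, a plain two-way spectral split standing in for the paper's "reduced spectral clustering", averaging as the score-combination rule, and all function names (`mutual_information`, `spectral_bipartition`, `lof_scores`, `misc_od_sketch`) and parameters (`bins`, `k`, `iters`).

```python
import math
from collections import Counter


def mutual_information(x, y, bins=4):
    """Histogram estimate (in nats) of the mutual information of two columns."""
    def discretize(col):
        lo, hi = min(col), max(col)
        width = (hi - lo) / bins or 1.0  # constant column: everything in bin 0
        return [min(int((v - lo) / width), bins - 1) for v in col]

    bx, by, n = discretize(x), discretize(y), len(x)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    return sum(c / n * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def spectral_bipartition(W, iters=500):
    """Two-way split by the sign of the second eigenvector of D^-1/2 W D^-1/2.

    Assumption: a simple stand-in for the paper's reduced spectral clustering.
    """
    d = len(W)
    deg = [sum(row) or 1e-12 for row in W]
    # Shift by the identity so every eigenvalue is non-negative.
    M = [[W[i][j] / math.sqrt(deg[i] * deg[j]) + (i == j) for j in range(d)]
         for i in range(d)]
    top = [math.sqrt(g) for g in deg]          # known leading eigenvector
    norm = math.sqrt(sum(t * t for t in top))
    top = [t / norm for t in top]
    v = [math.sin(i + 1.0) for i in range(d)]  # fixed, non-symmetric start
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(v, top))
        v = [a - dot * b for a, b in zip(v, top)]  # deflate leading vector
        v = [sum(M[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(a * a for a in v)) or 1.0
        v = [a / norm for a in v]
    groups = [[i for i in range(d) if v[i] >= 0],
              [i for i in range(d) if v[i] < 0]]
    return sorted(g for g in groups if g)


def lof_scores(rows, k=5):
    """Local Outlier Factor of every row (ties among neighbors ignored)."""
    n = len(rows)
    dist = [[math.dist(p, q) for q in rows] for p in rows]
    knn = [sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
           for i in range(n)]
    kdist = [dist[i][knn[i][-1]] for i in range(n)]
    # Local reachability density: inverse of the mean reachability distance.
    lrd = [k / (sum(max(kdist[j], dist[i][j]) for j in knn[i]) + 1e-12)
           for i in range(n)]
    return [sum(lrd[j] for j in knn[i]) / (k * lrd[i]) for i in range(n)]


def misc_od_sketch(X, k=5, bins=4):
    """MI matrix -> feature groups -> LOF per group -> averaged score."""
    d = len(X[0])
    cols = [[row[j] for row in X] for j in range(d)]
    W = [[0.0 if i == j else mutual_information(cols[i], cols[j], bins)
          for j in range(d)] for i in range(d)]
    groups = spectral_bipartition(W)
    per_group = [lof_scores([[row[j] for j in g] for row in X], k)
                 for g in groups]
    # Assumption: the combination rule is a plain mean over feature groups.
    scores = [sum(s) / len(per_group) for s in zip(*per_group)]
    return groups, scores


# Toy data: features 0-1 and 2-3 form two dependent pairs;
# the appended last row breaks the dependence inside the first pair.
X = [[i / 60, (i / 60) ** 2, (7 * i % 60) / 60, ((7 * i % 60) / 60) ** 2]
     for i in range(60)]
X.append([0.2, 0.9, 0.505, 0.255])
groups, scores = misc_od_sketch(X)
print(groups)
print(scores.index(max(scores)))
```

Scoring each feature group separately keeps every LOF run in a low-dimensional subspace, which is how the description motivates the method against the curse of dimensionality. In the toy data the appended row looks ordinary in features 2-3 and stands out only within the dependent pair 0-1.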