A Novel Unsupervised Outlier Detection Algorithm Based on Mutual Information and Reduced Spectral Clustering

Outlier detection is an essential research field in data mining, especially in the areas of network security, credit card fraud detection, industrial flaw detection, etc. The existing outlier detection algorithms, which can be divided into supervised methods and unsupervised methods, suffer from the...


Bibliographic Details
Main Authors: Yuehua Huang, Wenfen Liu, Song Li, Ying Guo, Wen Chen
Format: Article
Language: English
Published: MDPI AG 2023-12-01
Series: Electronics
Subjects: outlier detection, unsupervised, mutual information, spectral clustering
Online Access: https://www.mdpi.com/2079-9292/12/23/4864
author Yuehua Huang
Wenfen Liu
Song Li
Ying Guo
Wen Chen
collection DOAJ
description Outlier detection is an essential research field in data mining, especially in network security, credit card fraud detection, and industrial flaw detection. Existing outlier detection algorithms, which can be divided into supervised and unsupervised methods, suffer from three problems: the curse of dimensionality, a lack of labeled data, and the need for hyperparameter tuning. To address these issues, we present a novel unsupervised outlier detection algorithm based on mutual information and reduced spectral clustering, called MISC-OD (Mutual Information and reduced Spectral Clustering-Outlier Detection). MISC-OD first constructs a mutual information matrix between features, then applies reduced spectral clustering to divide the feature set into subsets. It runs LOF (Local Outlier Factor) within each subset, combines the per-subset outlier scores, and outputs the final outlier score. Our contributions are as follows: (1) we propose a novel outlier detection method called MISC-OD with high interpretability and scalability; (2) experiments on 18 benchmark datasets demonstrate the superior performance of the MISC-OD algorithm compared with eight state-of-the-art baselines in terms of ROC (receiver operating characteristic) and AP (average precision).
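The pipeline summarized in the description (feature-to-feature mutual information, a spectral split of the feature set, LOF per feature subset, combined scores) can be illustrated with a small standard-library sketch. It is not the authors' implementation. Several pieces are assumptions made only for illustration: mutual information is estimated with equal-width binning, a plain two-way split on the Fiedler vector stands in for the paper's reduced spectral clustering, the per-subset LOF scores are combined by averaging, and the function names (`mutual_information`, `spectral_bipartition`, `lof_scores`, `misc_od_sketch`), the parameters `k` and `bins`, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def mutual_information(x, y, bins=4):
    """Plug-in MI estimate (in nats) of two numeric columns via equal-width bins."""
    def discretize(col):
        lo, hi = min(col), max(col)
        width = (hi - lo) / bins or 1.0  # constant column: everything lands in bin 0
        return [min(int((v - lo) / width), bins - 1) for v in col]

    bx, by = discretize(x), discretize(y)
    n = len(bx)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(c / n * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())


def spectral_bipartition(weights, iters=200):
    """Assumed stand-in for reduced spectral clustering: split the nodes in two
    by the sign of the Fiedler vector of the graph Laplacian L = D - W."""
    d = len(weights)
    degree = [sum(row) for row in weights]
    shift = 2 * max(degree) or 1.0  # upper bound on the eigenvalues of L
    v = [math.sin(j + 1) for j in range(d)]  # arbitrary fixed start vector
    for _ in range(iters):
        # Power iteration on (shift*I - L), with the constant eigenvector removed.
        v = [(shift - degree[i]) * v[i] + sum(weights[i][j] * v[j] for j in range(d))
             for i in range(d)]
        mean = sum(v) / d
        v = [a - mean for a in v]
        norm = math.sqrt(sum(a * a for a in v)) or 1.0
        v = [a / norm for a in v]
    halves = ([j for j in range(d) if v[j] >= 0], [j for j in range(d) if v[j] < 0])
    return [g for g in halves if g]


def lof_scores(points, k=5):
    """Local Outlier Factor of every point, using its k nearest neighbours."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    knn = [sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
           for i in range(n)]
    kdist = [dist[i][knn[i][-1]] for i in range(n)]
    lrd = []  # local reachability density
    for i in range(n):
        reach = [max(kdist[j], dist[i][j]) for j in knn[i]]
        lrd.append(len(reach) / max(sum(reach), 1e-12))
    return [sum(lrd[j] for j in knn[i]) / (len(knn[i]) * lrd[i]) for i in range(n)]


def misc_od_sketch(rows, k=5, bins=4):
    """MI matrix -> feature groups -> LOF per group -> averaged score (assumed)."""
    cols = list(zip(*rows))
    d = len(cols)
    mi = [[0.0] * d for _ in range(d)]
    for a in range(d):
        for b in range(a + 1, d):
            mi[a][b] = mi[b][a] = mutual_information(cols[a], cols[b], bins)
    groups = spectral_bipartition(mi)
    per_group = [lof_scores([[r[j] for j in g] for r in rows], k) for g in groups]
    scores = [sum(s) / len(s) for s in zip(*per_group)]
    return groups, scores


# Toy data (hypothetical): features 0/1 and features 2/3 form two dependent pairs.
rows = []
for i in range(40):
    a = i / 40
    b = (i * 0.618034) % 1.0
    rows.append([a, 0.9 * a + 0.05, b, 1.0 - b])
rows.append([0.2, 0.8, 0.5, 0.5])  # planted outlier: breaks the 0/1 dependence only

groups, scores = misc_od_sketch(rows)
top = max(range(len(scores)), key=scores.__getitem__)
print("feature groups:", groups, "highest-scoring row:", top)
```

The planted row looks unremarkable in each feature taken alone and only stands out inside the subspace of features 0 and 1, which is the situation that grouping dependent features before running LOF is meant to expose. A faithful reproduction would replace the two-way split with the paper's reduced spectral clustering and use its score-combination rule.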
first_indexed 2024-03-09T01:53:24Z
format Article
id doaj.art-f2b0b1c83578459694f13d75bfe51c47
institution Directory of Open Access Journals
issn 2079-9292
language English
last_indexed 2024-03-09T01:53:24Z
publishDate 2023-12-01
publisher MDPI AG
record_format Article
series Electronics
spelling doaj.art-f2b0b1c83578459694f13d75bfe51c47 (2023-12-08T15:14:17Z)
Language: eng
Publisher: MDPI AG
Journal: Electronics (ISSN 2079-9292), vol. 12, no. 23, article 4864, 2023-12-01
DOI: 10.3390/electronics12234864
Title: A Novel Unsupervised Outlier Detection Algorithm Based on Mutual Information and Reduced Spectral Clustering
Authors: Yuehua Huang, Wenfen Liu, Song Li, Ying Guo, Wen Chen
Affiliation (all authors): School of Computer Science and Information Security & School of Software Engineering, Guilin University of Electronic Technology, Guilin 541004, China
URL: https://www.mdpi.com/2079-9292/12/23/4864
Keywords: outlier detection, unsupervised, mutual information, spectral clustering
title A Novel Unsupervised Outlier Detection Algorithm Based on Mutual Information and Reduced Spectral Clustering
topic outlier detection
unsupervised
mutual information
spectral clustering
url https://www.mdpi.com/2079-9292/12/23/4864