A Systematic Review of Anomaly Detection within High Dimensional and Multivariate Data

In data analysis, recognizing unusual patterns (outlier analysis or anomaly detection) plays a crucial role in identifying critical events. Because of its widespread use in many applications, it remains an important and extensive research branch in data mining. As a result, numerous techniques for...

Bibliographic Details
Main Authors: Suboh, Syahirah, Abdul Aziz, Izzatdin, Shaharudin, Shazlyn Milleana, Ismail, Saidatul Akmar, Mahdin, Hairulnizam
Format: Article
Language: English
Published: JOIV 2023
Subjects: T Technology (General)
Online Access: http://eprints.uthm.edu.my/9150/1/J15862_f3944b7e279a07421e2ed97fc6d397d2.pdf
_version_ 1796869911292674048
author Suboh, Syahirah
Abdul Aziz, Izzatdin
Shaharudin, Shazlyn Milleana
Ismail, Saidatul Akmar
Mahdin, Hairulnizam
author_facet Suboh, Syahirah
Abdul Aziz, Izzatdin
Shaharudin, Shazlyn Milleana
Ismail, Saidatul Akmar
Mahdin, Hairulnizam
author_sort Suboh, Syahirah
collection UTHM
description In data analysis, recognizing unusual patterns (outlier analysis or anomaly detection) plays a crucial role in identifying critical events. Because of its widespread use in many applications, it remains an important and extensive research branch in data mining. As a result, numerous techniques for finding anomalies have been developed, and more are still being worked on. Researchers can gain vital knowledge by identifying anomalies, which helps them make more meaningful data analyses. However, anomaly detection is even more challenging when the datasets are high-dimensional and multivariate. In the literature, anomaly detection in general has received much attention, but anomaly detection specifically under high-dimensional and multivariate conditions has received less. This paper systematically reviews the existing related techniques and presents extensive coverage of the challenges and perspectives of anomaly detection within high-dimensional and multivariate data. At the same time, it provides a clear insight into the techniques developed for anomaly detection problems. This paper aims to help readers select the technique best suited to their purpose. It has been found that PCA, DOBIN, Stray algorithm, and DAE-KNN have a high learning rate compared to Random projection, ROBEM, and OCP methods. Overall, most methods have shown an excellent ability to tackle the curse of dimensionality and multivariate features to perform anomaly detection. Moreover, a comparison of each algorithm for anomaly detection is also provided to support the development of better algorithms. Finally, a line of future study would be to extend this work by comparing the methods on other domain-specific datasets and offering a comprehensive anomaly interpretation that describes the truth of the anomalies.
first_indexed 2024-03-05T22:01:45Z
format Article
id uthm.eprints-9150
institution Universiti Tun Hussein Onn Malaysia
language English
last_indexed 2024-03-05T22:01:45Z
publishDate 2023
publisher JOIV
record_format dspace
spelling uthm.eprints-9150 2023-07-17T07:24:50Z http://eprints.uthm.edu.my/9150/ A Systematic Review of Anomaly Detection within High Dimensional and Multivariate Data Suboh, Syahirah Abdul Aziz, Izzatdin Shaharudin, Shazlyn Milleana Ismail, Saidatul Akmar Mahdin, Hairulnizam T Technology (General) JOIV 2023 Article PeerReviewed text en http://eprints.uthm.edu.my/9150/1/J15862_f3944b7e279a07421e2ed97fc6d397d2.pdf Suboh, Syahirah and Abdul Aziz, Izzatdin and Shaharudin, Shazlyn Milleana and Ismail, Saidatul Akmar and Mahdin, Hairulnizam (2023) A Systematic Review of Anomaly Detection within High Dimensional and Multivariate Data. INTERNATIONAL JOURNAL ON INFORMATICS VISUALIZATION, 7 (1). pp. 122-130.
spellingShingle T Technology (General)
Suboh, Syahirah
Abdul Aziz, Izzatdin
Shaharudin, Shazlyn Milleana
Ismail, Saidatul Akmar
Mahdin, Hairulnizam
A Systematic Review of Anomaly Detection within High Dimensional and Multivariate Data
title A Systematic Review of Anomaly Detection within High Dimensional and Multivariate Data
title_sort systematic review of anomaly detection within high dimensional and multivariate data
topic T Technology (General)
url http://eprints.uthm.edu.my/9150/1/J15862_f3944b7e279a07421e2ed97fc6d397d2.pdf