A literature review on one-class classification and its potential applications in big data
Abstract In severely imbalanced datasets, using traditional binary or multi-class classification typically leads to bias towards the class(es) with the much larger number of instances. Under such conditions, modeling and detecting instances of the minority class is very difficult. One-class classifi...
Main Authors: | Naeem Seliya, Azadeh Abdollah Zadeh, Taghi M. Khoshgoftaar |
---|---|
Format: | Article |
Language: | English |
Published: | SpringerOpen 2021-09-01 |
Series: | Journal of Big Data |
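Subjects: | One-class classification; Big data; Outlier detection; Novelty detection; Deep learning; Class imbalance |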
Online Access: | https://doi.org/10.1186/s40537-021-00514-x |
_version_ | 1818910005532819456 |
---|---|
author | Naeem Seliya; Azadeh Abdollah Zadeh; Taghi M. Khoshgoftaar |
author_facet | Naeem Seliya; Azadeh Abdollah Zadeh; Taghi M. Khoshgoftaar |
author_sort | Naeem Seliya |
collection | DOAJ |
description | Abstract In severely imbalanced datasets, using traditional binary or multi-class classification typically leads to bias towards the class(es) with the much larger number of instances. Under such conditions, modeling and detecting instances of the minority class is very difficult. One-class classification (OCC) is an approach to detecting data points that are abnormal relative to the instances of the known class, and it can help address issues related to severely imbalanced datasets, which are especially common in big data. We present a detailed survey of OCC-related literature published over approximately the last decade. We group the works into three categories: outlier detection, novelty detection, and deep learning and OCC. We closely examine and evaluate selected works on OCC such that a good cross section of approaches, methods, and application domains is represented in the survey. Commonly used techniques in OCC for outlier detection and for novelty detection, respectively, are discussed. We observe that one area largely omitted from the OCC-related literature is its application to big data and the problems inherently associated with it, such as severe class imbalance, class rarity, noisy data, feature selection, and data reduction. We expect the survey to be useful to researchers working in these areas of big data. |
first_indexed | 2024-12-19T22:35:56Z |
format | Article |
id | doaj.art-80029aa19f134bb587f19a0f9bcaf9c3 |
institution | Directory of Open Access Journals |
issn | 2196-1115 |
language | English |
last_indexed | 2024-12-19T22:35:56Z |
publishDate | 2021-09-01 |
publisher | SpringerOpen |
record_format | Article |
series | Journal of Big Data |
spelling | doaj.art-80029aa19f134bb587f19a0f9bcaf9c3; 2022-12-21T20:03:13Z; eng; SpringerOpen; Journal of Big Data; 2196-1115; 2021-09-01; 81131; 10.1186/s40537-021-00514-x; A literature review on one-class classification and its potential applications in big data; Naeem Seliya (0); Azadeh Abdollah Zadeh (1); Taghi M. Khoshgoftaar (2); University of Wisconsin-Eau Claire; Florida Atlantic University; Florida Atlantic University; Abstract In severely imbalanced datasets, using traditional binary or multi-class classification typically leads to bias towards the class(es) with the much larger number of instances. Under such conditions, modeling and detecting instances of the minority class is very difficult. One-class classification (OCC) is an approach to detecting data points that are abnormal relative to the instances of the known class, and it can help address issues related to severely imbalanced datasets, which are especially common in big data. We present a detailed survey of OCC-related literature published over approximately the last decade. We group the works into three categories: outlier detection, novelty detection, and deep learning and OCC. We closely examine and evaluate selected works on OCC such that a good cross section of approaches, methods, and application domains is represented in the survey. Commonly used techniques in OCC for outlier detection and for novelty detection, respectively, are discussed. We observe that one area largely omitted from the OCC-related literature is its application to big data and the problems inherently associated with it, such as severe class imbalance, class rarity, noisy data, feature selection, and data reduction. We expect the survey to be useful to researchers working in these areas of big data.; https://doi.org/10.1186/s40537-021-00514-x; One-class classification; Big data; Outlier detection; Novelty detection; Deep learning; Class imbalance |
spellingShingle | Naeem Seliya; Azadeh Abdollah Zadeh; Taghi M. Khoshgoftaar; A literature review on one-class classification and its potential applications in big data; Journal of Big Data; One-class classification; Big data; Outlier detection; Novelty detection; Deep learning; Class imbalance |
title | A literature review on one-class classification and its potential applications in big data |
title_full | A literature review on one-class classification and its potential applications in big data |
title_fullStr | A literature review on one-class classification and its potential applications in big data |
title_full_unstemmed | A literature review on one-class classification and its potential applications in big data |
title_short | A literature review on one-class classification and its potential applications in big data |
title_sort | literature review on one class classification and its potential applications in big data |
topic | One-class classification; Big data; Outlier detection; Novelty detection; Deep learning; Class imbalance |
url | https://doi.org/10.1186/s40537-021-00514-x |
work_keys_str_mv | AT naeemseliya aliteraturereviewononeclassclassificationanditspotentialapplicationsinbigdata AT azadehabdollahzadeh aliteraturereviewononeclassclassificationanditspotentialapplicationsinbigdata AT taghimkhoshgoftaar aliteraturereviewononeclassclassificationanditspotentialapplicationsinbigdata AT naeemseliya literaturereviewononeclassclassificationanditspotentialapplicationsinbigdata AT azadehabdollahzadeh literaturereviewononeclassclassificationanditspotentialapplicationsinbigdata AT taghimkhoshgoftaar literaturereviewononeclassclassificationanditspotentialapplicationsinbigdata |