A literature review on one-class classification and its potential applications in big data

Abstract In severely imbalanced datasets, using traditional binary or multi-class classification typically leads to bias towards the class(es) with the much larger number of instances. Under such conditions, modeling and detecting instances of the minority class is very difficult. One-class classifi...

Bibliographic Details
Main Authors: Naeem Seliya, Azadeh Abdollah Zadeh, Taghi M. Khoshgoftaar
Format: Article
Language: English
Published: SpringerOpen 2021-09-01
Series: Journal of Big Data
Subjects:
Online Access: https://doi.org/10.1186/s40537-021-00514-x
_version_ 1818910005532819456
author Naeem Seliya
Azadeh Abdollah Zadeh
Taghi M. Khoshgoftaar
author_facet Naeem Seliya
Azadeh Abdollah Zadeh
Taghi M. Khoshgoftaar
author_sort Naeem Seliya
collection DOAJ
description Abstract In severely imbalanced datasets, traditional binary or multi-class classification is typically biased towards the class(es) with the much larger number of instances. Under such conditions, modeling and detecting instances of the minority class is very difficult. One-class classification (OCC) is an approach that detects abnormal data points by comparing them with instances of the known class, and it can help address issues related to severely imbalanced datasets, which are especially common in big data. We present a detailed survey of OCC-related literature published over approximately the last decade. We group the works into three categories: outlier detection, novelty detection, and deep learning and OCC. We closely examine and evaluate selected works on OCC so that a good cross section of approaches, methods, and application domains is represented in the survey. Commonly used OCC techniques for outlier detection and for novelty detection are discussed. We observed that one area largely omitted in the OCC-related literature is its application to big data and the problems inherently associated with it, such as severe class imbalance, class rarity, noisy data, feature selection, and data reduction. We feel the survey will be appreciated by researchers working in these areas of big data.
first_indexed 2024-12-19T22:35:56Z
format Article
id doaj.art-80029aa19f134bb587f19a0f9bcaf9c3
institution Directory Open Access Journal
issn 2196-1115
language English
last_indexed 2024-12-19T22:35:56Z
publishDate 2021-09-01
publisher SpringerOpen
record_format Article
series Journal of Big Data
spelling doaj.art-80029aa19f134bb587f19a0f9bcaf9c3
2022-12-21T20:03:13Z
eng
SpringerOpen
Journal of Big Data
2196-1115
2021-09-01
volume 8, issue 1, pages 1-31
10.1186/s40537-021-00514-x
A literature review on one-class classification and its potential applications in big data
Naeem Seliya (University of Wisconsin-Eau Claire)
Azadeh Abdollah Zadeh (Florida Atlantic University)
Taghi M. Khoshgoftaar (Florida Atlantic University)
https://doi.org/10.1186/s40537-021-00514-x
One-class classification
Big data
Outlier detection
Novelty detection
Deep learning
Class imbalance
spellingShingle Naeem Seliya
Azadeh Abdollah Zadeh
Taghi M. Khoshgoftaar
A literature review on one-class classification and its potential applications in big data
Journal of Big Data
One-class classification
Big data
Outlier detection
Novelty detection
Deep learning
Class imbalance
title A literature review on one-class classification and its potential applications in big data
title_full A literature review on one-class classification and its potential applications in big data
title_fullStr A literature review on one-class classification and its potential applications in big data
title_full_unstemmed A literature review on one-class classification and its potential applications in big data
title_short A literature review on one-class classification and its potential applications in big data
title_sort literature review on one class classification and its potential applications in big data
topic One-class classification
Big data
Outlier detection
Novelty detection
Deep learning
Class imbalance
url https://doi.org/10.1186/s40537-021-00514-x
work_keys_str_mv AT naeemseliya aliteraturereviewononeclassclassificationanditspotentialapplicationsinbigdata
AT azadehabdollahzadeh aliteraturereviewononeclassclassificationanditspotentialapplicationsinbigdata
AT taghimkhoshgoftaar aliteraturereviewononeclassclassificationanditspotentialapplicationsinbigdata
AT naeemseliya literaturereviewononeclassclassificationanditspotentialapplicationsinbigdata
AT azadehabdollahzadeh literaturereviewononeclassclassificationanditspotentialapplicationsinbigdata
AT taghimkhoshgoftaar literaturereviewononeclassclassificationanditspotentialapplicationsinbigdata