VALIDATION ASSESSMENTS ON RESAMPLING METHOD IN IMBALANCED BINARY CLASSIFICATION FOR LINEAR DISCRIMINANT ANALYSIS

The curse of class imbalance affects the performance of many conventional classification algorithms including linear discriminant analysis (LDA). The data pre-processing approach through some resampling methods such as random oversampling (ROS) and random undersampling (RUS) is one of the treatment...


Bibliographic Details
Main Authors: Ahmad Hakiim Jamaluddin, Nor Idayu Mahat
Format: Article
Language: English
Published: UUM Press 2020-11-01
Series: Journal of ICT
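Subjects: Linear discriminant analysis; pre-processing; resampling method; class imbalance; binary classification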
Online Access: https://e-journal.uum.edu.my/index.php/jict/article/view/12401
collection DOAJ
description The curse of class imbalance affects the performance of many conventional classification algorithms, including linear discriminant analysis (LDA). Data pre-processing through resampling methods such as random oversampling (ROS) and random undersampling (RUS) is one of the treatments used to alleviate this curse. Previous studies have attempted to address the effect of a resampling method on the performance of LDA. However, some studies contradicted each other because they relied on different performance measures and validation strategies. This manuscript attempts to shed more light on the effect of a resampling method (ROS or RUS) on the performance of LDA, measured by true positive rate and true negative rate, through five validation strategies: leave-one-out cross-validation, k-fold cross-validation, repeated k-fold cross-validation, naive bootstrap, and .632+ bootstrap. One hundred simulated two-group bivariate normal data sets and four real data sets with severe class imbalance ratios were used. The analysis of the location and dispersion statistics of the performance measures sheds further light on: (i) the effect of a resampling method on the performance of LDA, and (ii) the enhancement in the learning fairness of LDA on objects regardless of sample size, which reduces the effect of the curse of class imbalance.
first_indexed 2024-04-13T11:17:25Z
format Article
id doaj.art-29bda896486c453d826fd24f222001ab
institution Directory Open Access Journal
issn 1675-414X
2180-3862
language English
last_indexed 2024-04-13T11:17:25Z
publishDate 2020-11-01
publisher UUM Press
record_format Article
series Journal of ICT
spelling doaj.art-29bda896486c453d826fd24f222001ab; 2022-12-22T02:48:55Z; eng; UUM Press; Journal of ICT; 1675-414X; 2180-3862; 2020-11-01; 201; VALIDATION ASSESSMENTS ON RESAMPLING METHOD IN IMBALANCED BINARY CLASSIFICATION FOR LINEAR DISCRIMINANT ANALYSIS; Ahmad Hakiim Jamaluddin (0), Nor Idayu Mahat (1); Department of Mathematics, Universiti Putra Malaysia, Malaysia; Centre for Testing, Measurement and Appraisal, Universiti Utara Malaysia, Malaysia; https://e-journal.uum.edu.my/index.php/jict/article/view/12401; Linear discriminant analysis; pre-processing; resampling method; class imbalance; binary classification
title VALIDATION ASSESSMENTS ON RESAMPLING METHOD IN IMBALANCED BINARY CLASSIFICATION FOR LINEAR DISCRIMINANT ANALYSIS
topic Linear discriminant analysis
pre-processing
resampling method
class imbalance
binary classification
url https://e-journal.uum.edu.my/index.php/jict/article/view/12401
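
The description above outlines an evaluation protocol: resample an imbalanced two-group data set with ROS or RUS, fit LDA, and estimate the true positive rate (TPR) and true negative rate (TNR) under a validation strategy. The following is a minimal sketch of one such pipeline using stratified k-fold cross-validation only. It is not the authors' code. The sample sizes, mean shift, identity covariance, fold count, and the choice to resample only the training folds are all assumptions made for illustration, and the function names (make_data, resample, fit_lda, cross_validate) are hypothetical.

```python
import math
import random


def make_data(n_major, n_minor, shift, rng):
    """Simulate two bivariate normal groups (identity covariance, assumed).
    Label 0 is the majority group, label 1 the minority group."""
    X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_major)]
    X += [(rng.gauss(shift, 1), rng.gauss(shift, 1)) for _ in range(n_minor)]
    return X, [0] * n_major + [1] * n_minor


def resample(X, y, method, rng):
    """Balance the classes: ROS duplicates minority rows at random,
    RUS keeps a random subset of majority rows."""
    idx = {c: [i for i, label in enumerate(y) if label == c] for c in (0, 1)}
    minor, major = sorted(idx, key=lambda c: len(idx[c]))
    if method == "ROS":
        extra = len(idx[major]) - len(idx[minor])
        keep = idx[major] + idx[minor] + rng.choices(idx[minor], k=extra)
    elif method == "RUS":
        keep = idx[minor] + rng.sample(idx[major], len(idx[minor]))
    else:  # "none": leave the data untouched
        keep = list(range(len(y)))
    return [X[i] for i in keep], [y[i] for i in keep]


def fit_lda(X, y):
    """Two-group LDA in two dimensions with a pooled covariance matrix and
    priors estimated from the (possibly resampled) training proportions."""
    groups = {c: [x for x, label in zip(X, y) if label == c] for c in (0, 1)}
    mean = {c: [sum(x[d] for x in g) / len(g) for d in (0, 1)]
            for c, g in groups.items()}
    s = [[0.0, 0.0], [0.0, 0.0]]  # pooled within-group covariance
    for c, g in groups.items():
        for x in g:
            dev = [x[0] - mean[c][0], x[1] - mean[c][1]]
            for a in (0, 1):
                for b in (0, 1):
                    s[a][b] += dev[a] * dev[b] / (len(y) - 2)
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    diff = [mean[1][d] - mean[0][d] for d in (0, 1)]
    # w = S^{-1} (mean_1 - mean_0), using the closed-form 2x2 inverse
    w = [(s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
         (-s[1][0] * diff[0] + s[0][0] * diff[1]) / det]
    mid = [(mean[0][d] + mean[1][d]) / 2 for d in (0, 1)]
    cut = (w[0] * mid[0] + w[1] * mid[1]
           - math.log(len(groups[1]) / len(groups[0])))
    return lambda x: int(w[0] * x[0] + w[1] * x[1] > cut)


def cross_validate(X, y, method, k, rng):
    """Stratified k-fold CV. Only the training part is resampled; the
    held-out fold keeps its original imbalance. Returns (TPR, TNR)
    computed from confusion counts pooled over all folds."""
    folds = [[] for _ in range(k)]
    for c in (0, 1):
        members = [i for i, label in enumerate(y) if label == c]
        rng.shuffle(members)
        for pos, i in enumerate(members):
            folds[pos % k].append(i)
    tp = tn = fp = fn = 0
    for test in folds:
        held_out = set(test)
        train = [i for i in range(len(y)) if i not in held_out]
        Xr, yr = resample([X[i] for i in train], [y[i] for i in train],
                          method, rng)
        predict = fit_lda(Xr, yr)
        for i in test:
            pred = predict(X[i])
            if y[i] == 1:
                tp += pred
                fn += 1 - pred
            else:
                tn += 1 - pred
                fp += pred
    return tp / (tp + fn), tn / (tn + fp)


X, y = make_data(200, 20, 1.5, random.Random(0))
results = {m: cross_validate(X, y, m, 5, random.Random(1))
           for m in ("none", "ROS", "RUS")}
for m, (tpr, tnr) in results.items():
    print(f"{m:>4}: TPR={tpr:.3f} TNR={tnr:.3f}")
```

Comparing the three printed rows gives a rough sense of how each resampling choice trades TPR against TNR on a single simulated data set. The record describes location and dispersion statistics over 100 simulated data sets and five validation strategies, so a fuller comparison would repeat this over many seeds and add the leave-one-out and bootstrap estimators, which this sketch omits.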