Ensemble Partition Sampling (EPS) for Improved Multi-Class Classification

Classification is a commonly used technique in data mining and is applied in various fields such as sentiment analysis, fraud detection, and fault diagnosis. Multiclass classification, which involves more than two classes, is more complex than binary classification. There are mainly two ways to appr...


Bibliographic Details
Main Authors: Brahim Jabir, Isabel De La Torre Diez, Ernesto Francisco Bautista Thompson, Debora Libertad Ramirez Vargas, Angel Gabriel Kuc Castilla
Format: Article
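Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Subjects: Ensemble partition sampling (EPS); one vs one (OvO); one vs all (OvA); multi-class classification; imbalanced learning; multiclass imbalanced classification
Online Access: https://ieeexplore.ieee.org/document/10121046/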
collection DOAJ
description Classification is a widely used data mining technique, applied in fields such as sentiment analysis, fraud detection, and fault diagnosis. Multiclass classification, which involves more than two classes, is more complex than binary classification. There are two main approaches: extending a binary classifier into a multiclass classifier through various strategies, or dividing the multiclass problem into multiple binary problems (binarization). Two popular binarization schemes are One vs One (OvO) and One vs All (OvA). Aggregating the outputs is simpler when fewer binary classifiers are used, as in OvA; however, pitting one class against all the others leaves each binary problem with unbalanced numbers of positive and negative samples, which degrades each binary classifier. This article proposes Ensemble Partition Sampling (EPS), a new method for multiclass classification within the OvA framework. Its primary goal is to tackle data imbalance by applying ensemble learning and preprocessing techniques to each binary dataset. In the study, EPS outperformed the compared methods, including OvA, SMOTE, k-means-SMOTE, Bagging-RB, DES-MI, OvO-EASY, and OvO-SMB, on imbalanced and multiclass imbalanced classification, and did so consistently with CART, Random Forest, and SVM as classifiers. The authors conclude that EPS is a highly effective approach for improving classification performance on imbalanced and multiclass imbalanced datasets.
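The imbalance that OvA binarization introduces can be illustrated with a short sketch. The record does not describe the EPS algorithm itself, so the partitioning step below is only an assumption about how balanced binary subsets might be built for an ensemble; it is not the authors' method. The function names (`ova_binarize`, `imbalance_ratio`, `balanced_partitions`) and the toy labels are hypothetical.

```python
from collections import Counter


def ova_binarize(labels, positive_class):
    """Relabel a multiclass target as 1 (positive_class) vs 0 (all other classes)."""
    return [1 if y == positive_class else 0 for y in labels]


def imbalance_ratio(binary_labels):
    """Number of negatives per positive (assumes at least one positive)."""
    counts = Counter(binary_labels)
    return counts[0] / counts[1]


def balanced_partitions(binary_labels):
    """Assumed illustration, not the published EPS procedure: split the
    negative indices into chunks of roughly the positive-set size and
    pair each chunk with all positives."""
    pos = [i for i, y in enumerate(binary_labels) if y == 1]
    neg = [i for i, y in enumerate(binary_labels) if y == 0]
    n_parts = max(1, round(len(neg) / len(pos)))
    # Deal negatives round-robin so chunk sizes differ by at most one.
    chunks = [neg[k::n_parts] for k in range(n_parts)]
    return [sorted(pos + chunk) for chunk in chunks]


# Toy data: four perfectly balanced classes of 10 samples each.
labels = ["a"] * 10 + ["b"] * 10 + ["c"] * 10 + ["d"] * 10

binary = ova_binarize(labels, "a")
ratio = imbalance_ratio(binary)        # 30 negatives / 10 positives = 3.0
subsets = balanced_partitions(binary)  # 3 subsets, each 10 positives + 10 negatives

print(ratio, [len(s) for s in subsets])
```

Even with perfectly balanced classes, the one-vs-all problem for class "a" has three negatives per positive. In this sketch each index subset would train one member of an ensemble, and the members' outputs would then be combined; how EPS actually samples and aggregates is specified only in the full article.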
first_indexed 2024-03-13T09:55:26Z
id doaj.art-3151e1d3bc8d4fb78815263908dc6f20
institution Directory Open Access Journal
issn 2169-3536
last_indexed 2024-03-13T09:55:26Z
spelling doaj.art-3151e1d3bc8d4fb78815263908dc6f20; 2023-05-23T23:00:18Z; eng
IEEE; IEEE Access; ISSN 2169-3536; 2023-01-01; vol. 11; pp. 48221-48235
DOI 10.1109/ACCESS.2023.3273925; IEEE Xplore document 10121046
Ensemble Partition Sampling (EPS) for Improved Multi-Class Classification
Brahim Jabir (https://orcid.org/0000-0002-8762-9199), LIMATI Laboratory, Sultan Moulay Slimane University, Beni Mellal, Morocco
Isabel De La Torre Diez, Department of Signal Theory and Communications, University of Valladolid, Valladolid, Spain
Ernesto Francisco Bautista Thompson, Higher Polytechnic School, Universidad Europea del Atlántico, Santander, Spain
Debora Libertad Ramirez Vargas, Higher Polytechnic School, Universidad Europea del Atlántico, Santander, Spain
Angel Gabriel Kuc Castilla, Department of Engineering and Projects, Universidad Internacional Iberoamericana, Campeche, Mexico
topic Ensemble partition sampling (EPS)
one vs one (OvO)
one vs all (OvA)
multi-class classification
imbalanced learning
multiclass imbalanced classification