Ensemble Partition Sampling (EPS) for Improved Multi-Class Classification
Classification is a commonly used technique in data mining and is applied in various fields such as sentiment analysis, fraud detection, and fault diagnosis. Multiclass classification, which involves more than two classes, is more complex than binary classification. There are mainly two ways to appr...
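Main Authors: | Brahim Jabir, Isabel De La Torre Diez, Ernesto Francisco Bautista Thompson, Debora Libertad Ramirez Vargas, Angel Gabriel Kuc Castilla |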
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2023-01-01 |
Series: | IEEE Access |
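Subjects: | Ensemble partition sampling (EPS); one vs one (OvO); one vs all (OvA); multi-class classification; imbalanced learning; multiclass imbalanced classification |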
Online Access: | https://ieeexplore.ieee.org/document/10121046/ |
_version_ | 1797821617409097728 |
---|---|
author | Brahim Jabir; Isabel De La Torre Diez; Ernesto Francisco Bautista Thompson; Debora Libertad Ramirez Vargas; Angel Gabriel Kuc Castilla |
author_sort | Brahim Jabir |
collection | DOAJ |
description | Classification is a widely used data mining technique, applied in fields such as sentiment analysis, fraud detection, and fault diagnosis. Multiclass classification, which involves more than two classes, is more complex than binary classification. There are two main ways to approach it: extend a binary classifier into a multiclass classifier through various strategies, or divide the multiclass problem into multiple binary problems (binarization). Two popular binarization approaches are One vs One (OvO) and One vs All (OvA). Aggregating the outputs of the binary classifiers becomes simpler as the number of classifiers decreases. However, using fewer classifiers, as in OvA, causes an imbalance between the numbers of positive and negative samples, which degrades the performance of each binary classifier. This article contributes to ensemble learning and multi-class classification by proposing Ensemble Partition Sampling (EPS), a new approach to multiclass classification within the OvA framework. Its primary goal is to tackle data imbalance by incorporating ensemble learning and preprocessing techniques into each binary dataset. The study found that EPS was the most effective method for imbalanced and multiclass imbalanced classification, outperforming OvA, SMOTE, k-means-SMOTE, Bagging-RB, DES-MI, OvO-EASY, and OvO-SMB. With CART, Random Forest, and SVM as classifiers, the results consistently showed EPS outperforming all other algorithms. The authors conclude that EPS is a highly effective method for improving classification performance on imbalanced and multiclass imbalanced datasets. |
first_indexed | 2024-03-13T09:55:26Z |
format | Article |
id | doaj.art-3151e1d3bc8d4fb78815263908dc6f20 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-03-13T09:55:26Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-3151e1d3bc8d4fb78815263908dc6f20; 2023-05-23T23:00:18Z; eng; IEEE; IEEE Access; 2169-3536; 2023-01-01; 11; 48221; 48235; 10.1109/ACCESS.2023.3273925; 10121046; Ensemble Partition Sampling (EPS) for Improved Multi-Class Classification; Brahim Jabir [0] (https://orcid.org/0000-0002-8762-9199); Isabel De La Torre Diez [1]; Ernesto Francisco Bautista Thompson [2]; Debora Libertad Ramirez Vargas [3]; Angel Gabriel Kuc Castilla [4]; [0] LIMATI Laboratory, Sultan Moulay Slimane University, Beni Mellal, Morocco; [1] Department of Signal Theory and Communications, University of Valladolid, Valladolid, Spain; [2] Higher Polytechnic School, Universidad Europea del Atlántico, Santander, Spain; [3] Higher Polytechnic School, Universidad Europea del Atlántico, Santander, Spain; [4] Department of Engineering and Projects, Universidad Internacional Iberoamericana, Campeche, Mexico; https://ieeexplore.ieee.org/document/10121046/; Ensemble partition sampling (EPS); one vs one (OvO); one vs all (OvA); multi-class classification; imbalanced learning; multiclass imbalanced classification |
title | Ensemble Partition Sampling (EPS) for Improved Multi-Class Classification |
topic | Ensemble partition sampling (EPS); one vs one (OvO); one vs all (OvA); multi-class classification; imbalanced learning; multiclass imbalanced classification |
url | https://ieeexplore.ieee.org/document/10121046/ |