Improved PSO_AdaBoost Ensemble Algorithm for Imbalanced Data

The Adaptive Boosting (AdaBoost) algorithm is a widely used ensemble learning framework that achieves good classification results on general datasets. However, it is difficult to apply AdaBoost directly to imbalanced data, since it is designed mainly for processing misclassified sam...

Bibliographic Details
Main Authors: Kewen Li, Guangyue Zhou, Jiannan Zhai, Fulai Li, Mingwen Shao
Format: Article
Language: English
Published: MDPI AG 2019-03-01
Series: Sensors
Subjects: Adaptive Boosting; imbalanced data; Area Under Curve; Particle Swarm Optimization
Online Access: https://www.mdpi.com/1424-8220/19/6/1476
_version_ 1817990313422094336
author Kewen Li
Guangyue Zhou
Jiannan Zhai
Fulai Li
Mingwen Shao
author_sort Kewen Li
collection DOAJ
description The Adaptive Boosting (AdaBoost) algorithm is a widely used ensemble learning framework that achieves good classification results on general datasets. However, it is difficult to apply AdaBoost directly to imbalanced data, since it is designed mainly to handle misclassified samples rather than minority-class samples. To better process imbalanced data, this paper introduces the Area Under Curve (AUC) indicator, which reflects the comprehensive performance of the model, and proposes an improved AdaBoost algorithm based on AUC (AdaBoost-A), which improves the error calculation of AdaBoost by jointly considering misclassification probability and AUC. To prevent the redundant or useless weak classifiers generated by the traditional AdaBoost algorithm from consuming too many system resources, this paper also proposes an ensemble algorithm, PSOPD-AdaBoost-A, which can re-initialize parameters to avoid falling into local optima and which optimizes the coefficients of the AdaBoost weak classifiers. Experimental results show that the proposed algorithm is effective for processing imbalanced data, especially data with a relatively high degree of imbalance.
first_indexed 2024-04-14T00:57:48Z
format Article
id doaj.art-ce0079984964460d99946c5d74854db2
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-04-14T00:57:48Z
publishDate 2019-03-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-ce0079984964460d99946c5d74854db2
2022-12-22T02:21:33Z
eng
MDPI AG
Sensors
1424-8220
2019-03-01
Volume 19, Issue 6, Article 1476
10.3390/s19061476
Improved PSO_AdaBoost Ensemble Algorithm for Imbalanced Data
Kewen Li, College of Computer and Communication Engineering, China University of Petroleum, Qingdao 266580, Shandong, China
Guangyue Zhou, College of Computer and Communication Engineering, China University of Petroleum, Qingdao 266580, Shandong, China
Jiannan Zhai, Institute for Sensing and Embedded Network Systems Engineering, Florida Atlantic University, 777 Glades Road, Boca Raton, FL 33431, USA
Fulai Li, School of Geosciences, China University of Petroleum, Qingdao 266580, Shandong, China
Mingwen Shao, College of Computer and Communication Engineering, China University of Petroleum, Qingdao 266580, Shandong, China
https://www.mdpi.com/1424-8220/19/6/1476
Adaptive Boosting
imbalanced data
Area Under Curve
Particle Swarm Optimization
title Improved PSO_AdaBoost Ensemble Algorithm for Imbalanced Data
topic Adaptive Boosting
imbalanced data
Area Under Curve
Particle Swarm Optimization
url https://www.mdpi.com/1424-8220/19/6/1476
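
The description in this record says that AdaBoost-A computes its error by considering both misclassification probability and AUC. The record does not give the formula, so the following is only a minimal sketch under stated assumptions: the linear blend in `combined_error` and its `mix` parameter are hypothetical and are not taken from the article, while `classifier_coefficient` uses the standard AdaBoost expression 0.5 * ln((1 - err) / err). Labels are assumed to be +1 for the minority class and -1 for the majority class. The particle swarm step of PSOPD-AdaBoost-A is not sketched here.

```python
import math


def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == -1]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample from each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def weighted_error(preds, labels, weights):
    """Standard AdaBoost error: share of total weight on misclassified samples."""
    wrong = sum(w for p, y, w in zip(preds, labels, weights) if p != y)
    return wrong / sum(weights)


def combined_error(preds, scores, labels, weights, mix=0.5):
    """Hypothetical blend of weighted error and (1 - AUC); not the article's formula."""
    err = weighted_error(preds, labels, weights)
    return mix * err + (1.0 - mix) * (1.0 - auc(scores, labels))


def classifier_coefficient(err, eps=1e-10):
    """Standard AdaBoost coefficient: alpha = 0.5 * ln((1 - err) / err)."""
    err = min(max(err, eps), 1.0 - eps)  # clamp to avoid log(0) and division by zero
    return 0.5 * math.log((1.0 - err) / err)
```

As an illustration, take one minority sample and seven majority samples with equal weights, and a weak classifier that predicts the majority class for every sample with identical scores. Its weighted error is 0.125, which looks good, but its AUC is 0.5, so the blended error with `mix=0.5` is 0.3125 and the resulting coefficient is smaller than the one plain AdaBoost would assign.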