XGBoost for Imbalanced Data Based on Cost-sensitive Activation Function

Bibliographic Details
Main Authors: LI Jing-tai, WANG Xiao-dan
Format: Article
Language: zho
Published: Editorial office of Computer Science 2022-05-01
Series: Jisuanji kexue
Subjects:
Online Access: https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2022-49-5-135.pdf
Description
Summary: For binary classification with class imbalance, an XGBoost algorithm with a cost-sensitive activation function (CSAF-XGBoost) is proposed to improve the recognition of minority-class samples. When XGBoost constructs decision trees, imbalanced data affects split-point selection, which leads to misclassification of minority-class samples. With the cost-sensitive activation function (CSAF), samples in different estimation states receive different gradient variations, which addresses the problem that the gradient variation of a misclassified minority-class sample is too small for the sample to be recognized correctly in later iterations. The experiments analyze the relation between the imbalance ratio (IR) and the parameters, and compare performance with SMOTE-XGBoost, ADASYN-XGBoost, Focal loss-XGBoost and Weight-XGBoost on UCI datasets. In minority-class recall, CSAF-XGBoost surpasses the best of these methods by 6.75% on average and by up to 15%, with F1-score and AUC at the same level. The results show that CSAF-XGBoost recognizes minority-class samples better and has wider applicability.
ISSN: 1002-137X
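
The summary refers to the imbalance ratio (IR) and to a Weight-XGBoost baseline but gives no formulas. As a rough illustration only, the sketch below computes IR as majority count over minority count and the per-sample gradient and Hessian of a class-weighted logistic loss, the pair of values that a custom XGBoost objective returns. This is an assumption about what a weighting baseline typically looks like; it is not the paper's CSAF, whose definition does not appear in this record. The function names and the choice of IR as the minority-class weight are hypothetical.

```python
import math


def imbalance_ratio(labels):
    """Return majority count / minority count for binary 0/1 labels."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    return max(pos, neg) / min(pos, neg)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def weighted_logistic_grad_hess(raw_scores, labels, pos_weight):
    """Gradient and Hessian of class-weighted log loss w.r.t. the raw score.

    Assumption: positive (label 1) samples are the minority class and are
    scaled by pos_weight; negative samples keep weight 1.
    """
    grads, hesses = [], []
    for z, y in zip(raw_scores, labels):
        p = sigmoid(z)
        w = pos_weight if y == 1 else 1.0
        grads.append(w * (p - y))          # d(loss)/dz
        hesses.append(w * p * (1.0 - p))   # d2(loss)/dz2
    return grads, hesses


# Toy data: nine majority samples (0) and one minority sample (1).
labels = [0] * 9 + [1]
ir = imbalance_ratio(labels)

# Before any tree is fitted, every raw score is 0, so p = 0.5 everywhere.
raw_scores = [0.0] * len(labels)
grads, hesses = weighted_logistic_grad_hess(raw_scores, labels, pos_weight=ir)
```

With the minority weight set to IR, the single minority sample contributes as much total gradient as the nine majority samples combined, which is the effect a weighting baseline relies on. XGBoost's `scale_pos_weight` parameter applies a comparable scaling to positive samples.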