XGBoost for Imbalanced Data Based on Cost-sensitive Activation Function

For imbalanced binary classification, a cost-sensitive activation function XGBoost algorithm (CSAF-XGBoost) is proposed to improve the recognition of minority-class samples. When XGBoost constructs decision trees, imbalanced data affects split-point selection, which leads to mi...

Bibliographic Details
Main Authors: LI Jing-tai, WANG Xiao-dan
Format: Article
Language: Chinese (zho)
Published: Editorial office of Computer Science 2022-05-01
Series: Jisuanji kexue
Subjects: cost-sensitive, logistic regression, data imbalanced classification, xgboost, activation function
Online Access: https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2022-49-5-135.pdf
collection DOAJ
description For imbalanced binary classification, a cost-sensitive activation function XGBoost algorithm (CSAF-XGBoost) is proposed to improve the recognition of minority-class samples. When XGBoost constructs decision trees, imbalanced data affects split-point selection, which leads to misclassification of the minority class. By constructing a cost-sensitive activation function (CSAF), samples with different prediction outcomes are subject to different gradient changes, which addresses the problem that the gradient change of a misclassified minority sample is too small for the sample to be recognized correctly in later iterations. The experiments analyze the relation between the imbalance ratio (IR) and the parameters, and compare performance with SMOTE-XGBoost, ADASYN-XGBoost, Focal loss-XGBoost and Weight-XGBoost on UCI datasets. In minority-class recall, CSAF-XGBoost surpasses the best competing method by 6.75% on average and by up to 15%, while its F1-score and AUC remain at the same level. The results show that CSAF-XGBoost recognizes minority-class samples better and is more widely applicable.
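The gradient issue mentioned in the description can be illustrated with a minimal sketch. This is not the paper's CSAF, whose formula the abstract does not give. It only shows the standard logistic objective, where the gradient with respect to the raw score is p - y and so never exceeds 1 in magnitude, next to a simple class-weighted variant. The function name `weighted_logistic_grad_hess` and the parameter `pos_weight` are hypothetical names chosen for this illustration.

```python
import numpy as np

def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + np.exp(-z))

def weighted_logistic_grad_hess(raw_score, label, pos_weight=1.0):
    """Gradient and Hessian of class-weighted log loss w.r.t. the raw score.

    Assumption for illustration: positives (label == 1) are the minority
    class and get weight `pos_weight`; negatives keep weight 1.
    """
    p = sigmoid(raw_score)
    w = np.where(label == 1, pos_weight, 1.0)
    grad = w * (p - label)        # first-order term used by boosting
    hess = w * p * (1.0 - p)      # second-order term used by boosting
    return grad, hess

# One misclassified minority sample and one correctly scored majority sample,
# both with raw score -2 (predicted probability about 0.12).
raw = np.array([-2.0, -2.0])
labels = np.array([1.0, 0.0])

g_plain, h_plain = weighted_logistic_grad_hess(raw, labels)
g_weighted, h_weighted = weighted_logistic_grad_hess(raw, labels, pos_weight=5.0)

print("plain gradients:   ", g_plain)
print("weighted gradients:", g_weighted)
```

With the plain objective, the misclassified minority sample contributes a gradient whose magnitude stays below 1, however many majority samples share its leaf. The weighted variant scales only that sample's gradient and Hessian and leaves the majority sample unchanged.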
first_indexed 2024-04-09T17:32:37Z
id doaj.art-9117ef5a6d3642b4b1cbad44cdb09f1f
institution Directory Open Access Journal
issn 1002-137X
last_indexed 2024-04-09T17:32:37Z
spelling doaj.art-9117ef5a6d3642b4b1cbad44cdb09f1f; 2023-04-18T02:35:57Z; zho; Editorial office of Computer Science; Jisuanji kexue; ISSN 1002-137X; 2022-05-01; vol. 49, no. 5, pp. 135-143; DOI 10.11896/jsjkx.210400064; LI Jing-tai, WANG Xiao-dan (Air and Missile Defense College, Air Force Engineering University, Xi’an 710051, China)
topic cost-sensitive|logistic regression|data imbalanced classification|xgboost|activation function