XGBoost for Imbalanced Data Based on Cost-sensitive Activation Function
For binary classification with class imbalance, a cost-sensitive activation function XGBoost algorithm (CSAF-XGBoost) is proposed to improve the recognition of minority-class samples. When XGBoost constructs decision trees, imbalanced data affects split-point selection, which leads to mi...
Main Author: | |
---|---|
Format: | Article |
Language: | zho |
Published: | Editorial office of Computer Science, 2022-05-01 |
Series: | Jisuanji kexue |
Subjects: | |
Online Access: | https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2022-49-5-135.pdf |
_version_ | 1797845074079383552 |
---|---|
author | LI Jing-tai, WANG Xiao-dan |
collection | DOAJ |
description | For binary classification with class imbalance, a cost-sensitive activation function XGBoost algorithm (CSAF-XGBoost) is proposed to improve the recognition of minority-class samples. When XGBoost constructs decision trees, imbalanced data affects split-point selection, which leads to misclassification of minority samples. By constructing a cost-sensitive activation function (CSAF), samples in different estimation states receive different gradient variations, which addresses the problem that the gradient variation of a misclassified minority sample is too small for the sample to be recognized correctly in later iterations. The experiments analyze the relation between the imbalance rate (IR) and the parameters, and compare performance with SMOTE-XGBoost, ADASYN-XGBoost, Focal loss-XGBoost and Weight-XGBoost on UCI datasets. In minority-class recall, CSAF-XGBoost surpasses the best competing method by 6.75% on average and by up to 15%, with F1-score and AUC at the same level. The results show that CSAF-XGBoost performs better at recognizing minority-class samples and has wider applicability. |
first_indexed | 2024-04-09T17:32:37Z |
format | Article |
id | doaj.art-9117ef5a6d3642b4b1cbad44cdb09f1f |
institution | Directory Open Access Journal |
issn | 1002-137X |
language | zho |
last_indexed | 2024-04-09T17:32:37Z |
publishDate | 2022-05-01 |
publisher | Editorial office of Computer Science |
record_format | Article |
series | Jisuanji kexue |
spelling | doaj.art-9117ef5a6d3642b4b1cbad44cdb09f1f; 2023-04-18T02:35:57Z; zho; Editorial office of Computer Science; Jisuanji kexue; 1002-137X; 2022-05-01; vol. 49, no. 5, pp. 135-143; doi:10.11896/jsjkx.210400064; XGBoost for Imbalanced Data Based on Cost-sensitive Activation Function; LI Jing-tai, WANG Xiao-dan (Air and Missile Defense College, Air Force Engineering University, Xi’an 710051, China); https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2022-49-5-135.pdf; cost-sensitive|logistic regression|data imbalanced classification|xgboost|activation function |
title | XGBoost for Imbalanced Data Based on Cost-sensitive Activation Function |
topic | cost-sensitive|logistic regression|data imbalanced classification|xgboost|activation function |
url | https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2022-49-5-135.pdf |
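
The description in this record mentions the imbalance rate (IR) and per-sample gradient variations, but it does not define the CSAF itself, so the Python sketch below does not attempt to reproduce it. It only illustrates, under stated assumptions, the generic idea of a cost-weighted logistic objective, in which each sample contributes a first derivative (gradient) and a second derivative (Hessian) to a boosting step. The function names, the use of label 1 for the minority class, the definition of IR as majority count divided by minority count, and the use of IR as the minority weight are all assumptions made for illustration; they are not taken from the paper and may not match its Weight-XGBoost baseline.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def imbalance_rate(labels):
    """Assumed definition: majority count / minority count (minority = label 1)."""
    minority = sum(1 for y in labels if y == 1)
    majority = len(labels) - minority
    if minority == 0 or majority == 0:
        raise ValueError("both classes must be present")
    return majority / minority


def weighted_logistic_grad_hess(margins, labels, minority_weight):
    """Per-sample gradient and Hessian of a class-weighted logistic loss.

    margins: raw (pre-sigmoid) scores of the current ensemble.
    labels: 0/1 class labels, with 1 assumed to be the minority class.
    minority_weight: hypothetical cost applied to minority samples only.
    """
    grads, hessians = [], []
    for z, y in zip(margins, labels):
        p = sigmoid(z)
        w = minority_weight if y == 1 else 1.0
        grads.append(w * (p - y))            # d(loss)/d(margin)
        hessians.append(w * p * (1.0 - p))   # d^2(loss)/d(margin)^2
    return grads, hessians
```

With one minority sample among ten and all margins at zero, `imbalance_rate` gives 9.0, and passing that value as `minority_weight` scales the minority sample's gradient and Hessian by nine relative to the unweighted logistic loss. A pair of gradient and Hessian values per sample is the form that gradient-boosting libraries generally expect from a user-defined objective, though wiring this into a specific library is left out here.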