Binary Imbalanced Data Classification Based on Modified D2GAN Oversampling and Classifier Fusion

The binary imbalance problem refers to a classification scenario in which one class contains a large number of samples while the other contains only a few. When traditional classifiers are faced with imbalanced datasets, they are usually biased toward the majority class, resulting in poor classification...

Bibliographic Details
Main Authors: Junhai Zhai, Jiaxing Qi, Sufang Zhang
Format: Article
Language: English
Published: IEEE 2020-01-01
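Series: IEEE Access
Subjects: Binary class imbalance; diversity oversampling; generative adversarial network; classifier fusion; fuzzy integral
Online Access: https://ieeexplore.ieee.org/document/9195865/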
author Junhai Zhai
Jiaxing Qi
Sufang Zhang
collection DOAJ
description The binary imbalance problem refers to a classification scenario in which one class contains a large number of samples while the other contains only a few. When traditional classifiers are faced with imbalanced datasets, they are usually biased toward the majority class, resulting in poor classification performance. Oversampling is an effective way to address this problem, yet generating diverse synthetic samples remains a challenge. In this article, we propose a diversity oversampling method based on a modified D2GAN model and, building on it, a binary imbalanced data classification approach based on classifier fusion by fuzzy integral. Extensive experiments on 8 data sets compare the proposed methods with 7 state-of-the-art methods on 5 measures: MMD score, silhouette score, F-measure, G-mean, and AUC. The 7 methods comprise 4 SMOTE-related approaches and 3 GAN-related approaches. The experimental results demonstrate that the proposed methods are more effective and efficient than the compared approaches.
first_indexed 2024-12-20T05:33:06Z
format Article
id doaj.art-3248186172f54b829d379f64d038db41
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-20T05:33:06Z
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-3248186172f54b829d379f64d038db41
2022-12-21T19:51:41Z
eng
IEEE
IEEE Access
2169-3536
2020-01-01
volume 8, pages 169456-169469
doi 10.1109/ACCESS.2020.3023949
IEEE Xplore document 9195865
Binary Imbalanced Data Classification Based on Modified D2GAN Oversampling and Classifier Fusion
Junhai Zhai (https://orcid.org/0000-0001-9962-7417), Hebei Key Laboratory of Machine Learning and Computational Intelligence, College of Mathematics and Information Science, Hebei University, Baoding, China
Jiaxing Qi, Hebei Key Laboratory of Machine Learning and Computational Intelligence, College of Mathematics and Information Science, Hebei University, Baoding, China
Sufang Zhang (https://orcid.org/0000-0002-7585-6490), Hebei Branch of China Meteorological Administration Training Center, China Meteorological Administration, Baoding, China
https://ieeexplore.ieee.org/document/9195865/
title Binary Imbalanced Data Classification Based on Modified D2GAN Oversampling and Classifier Fusion
topic Binary class imbalance
diversity oversampling
generative adversarial network
classifier fusion
fuzzy integral
url https://ieeexplore.ieee.org/document/9195865/