Relative Density-Based Intuitionistic Fuzzy SVM for Class Imbalance Learning

The support vector machine (SVM) has been combined with the intuitionistic fuzzy set to suppress the negative impact of noises and outliers in classification. However, it has some inherent defects, resulting in the inaccurate prior distribution estimation for datasets, especially the imbalanced data...

Full description

Bibliographic Details
Main Authors: Cui Fu, Shuisheng Zhou, Dan Zhang, Li Chen
Format: Article
Language:English
Published: MDPI AG 2022-12-01
Series:Entropy
Subjects:
Online Access:https://www.mdpi.com/1099-4300/25/1/34
Description
Summary:The support vector machine (SVM) has been combined with the intuitionistic fuzzy set to suppress the negative impact of noises and outliers in classification. However, it has some inherent defects, resulting in the inaccurate prior distribution estimation for datasets, especially the imbalanced datasets with non-normally distributed data, further reducing the performance of the classification model for imbalance learning. To solve these problems, we propose a novel relative density-based intuitionistic fuzzy support vector machine (RIFSVM) algorithm for imbalanced learning in the presence of noise and outliers. In our proposed algorithm, the relative density, which is estimated by adopting the k-nearest-neighbor distances, is used to calculate the intuitionistic fuzzy numbers. The fuzzy values of the majority class instances are designed by multiplying the score function of the intuitionistic fuzzy number by the imbalance ratio, and the fuzzy values of minority class instances are assigned the intuitionistic fuzzy membership degree. With the help of the strong capture ability of the relative density to prior information and the strong recognition ability of the intuitionistic fuzzy score function to noises and outliers, the proposed RIFSVM not only reduces the influence of class imbalance but also suppresses the impact of noises and outliers, and further improves the classification performance. Experiments on the synthetic and public imbalanced datasets show that our approach has better performance in terms of G-Means, F-Measures, and AUC than the other class imbalance classification algorithms.
ISSN:1099-4300