Attribute Reduction Algorithm for Incomplete Information Systems Based on Intuitive Fuzzy Pairs

Bibliographic Details
Main Authors: Weihan Li, Jianwei Guo
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10210416/
Description
Summary: Current attribute reduction algorithms for information systems struggle to handle imbalanced data with default values. To address the shortcomings of traditional attribute reduction algorithms (ARAs) in incomplete information systems, a new algorithm is proposed that introduces intuitive fuzzy pairs (IFP). In addition, a composite minority oversampling technique, TampC and Central Limit SMOTE (TampC-CL-SMOTE), is proposed to improve the algorithm's data pre-sampling method, and its effectiveness is verified experimentally. The experimental results show that the improved attribute reduction algorithm achieves an average classification accuracy of 82.13% with a naive Bayes classifier and 86.48% with a support vector machine classifier. In the comparison of operational efficiency, the average running time of the improved algorithm is 5.92 seconds, and its overall running time is lower than that of the comparison algorithm. The average accuracy, recall, and F-measure of the algorithm are 76.14%, 78.35%, and 77.19%, respectively. The G-means of TampC-CL-SMOTE are 2.9% and 5.3% higher than those of the comparison algorithm, respectively. Overall, the improved attribute reduction algorithm handles imbalanced data efficiently, and the TampC-CL-SMOTE optimization is effective in practical applications, with advantages on both high- and low-imbalance data in incomplete information environments.
ISSN:2169-3536
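
Note on the metrics: the summary reports F-measure and G-mean, which are commonly used to evaluate classifiers on imbalanced data. The sketch below shows the standard textbook definitions only, assuming a binary confusion matrix. The function names `f_measure` and `g_mean` are hypothetical, the example counts are made up for illustration, and nothing here is taken from the authors' implementation.

```python
import math


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1); 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def g_mean(tp: int, fn: int, tn: int, fp: int) -> float:
    """Geometric mean of sensitivity (minority recall) and specificity."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


if __name__ == "__main__":
    # Illustrative counts (not from the article): 10 minority samples,
    # 100 majority samples.
    print(f_measure(0.5, 0.5))
    print(g_mean(tp=8, fn=2, tn=90, fp=10))
```

G-mean is low whenever either class is poorly recognized, which is why it is often preferred over plain accuracy when one class is much rarer than the other.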