Summary: The curse of dimensionality and over-fitting are problems commonly associated with high-dimensional data. Feature
selection is one method that can overcome these problems. This paper proposes FSCI, a causal feature selection algorithm
based on floating search and conditional independence testing. FSCI uses mutual information with a floating search strategy to
eliminate irrelevant features and removes redundant features using conditional independence testing. The experimental evaluation
is based on 8 datasets, and the results are assessed by the number of selected features, classification accuracy, and the
complexity of the algorithm.
algorithm. The results are compared with the non-causal feature selection algorithms FCBF, ReliefF, and with the causal feature
selection algorithms MMPC, IAMB, FBED and MMMB. The overall results show that the average number of features selected
by the proposed FSCI algorithm (12.8) is below those with ReliefF (16.5) and MMMB (13) algorithms. According to the
classification tests, FSCI algorithm provided the highest average accuracy (87.40%) among the feature selection methods tested.
Moreover, FSCI can infer causality with less complexity.
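The two-stage idea in the summary (mutual-information-based relevance filtering, then conditional-independence-based redundancy removal) can be illustrated with a highly simplified sketch for discrete features. The `fsci_sketch` name, the threshold `eps`, and the pairwise conditional-mutual-information test are illustrative assumptions, not the paper's exact FSCI procedure; in particular, the full floating search strategy is omitted here.

```python
import numpy as np

def mutual_info(x, y):
    """Empirical mutual information (nats) between two discrete arrays."""
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))
            px, py = np.mean(x == xv), np.mean(y == yv)
            if pxy > 0:
                mi += pxy * np.log(pxy / (px * py))
    return mi

def cond_mutual_info(x, y, z):
    """Empirical conditional mutual information I(x; y | z) for discrete arrays."""
    cmi = 0.0
    for zv in np.unique(z):
        mask = (z == zv)
        cmi += np.mean(mask) * mutual_info(x[mask], y[mask])
    return cmi

def fsci_sketch(X, y, eps=0.05):
    """Simplified two-stage selection: relevance filter, then CI-based redundancy removal.

    Stage 1 keeps features with MI(X_i; y) > eps (irrelevance filter).
    Stage 2 drops X_i if it is conditionally independent of y given some
    other kept feature X_j, i.e. I(X_i; y | X_j) <= eps (redundancy filter).
    """
    X = np.asarray(X)
    relevant = [i for i in range(X.shape[1]) if mutual_info(X[:, i], y) > eps]
    selected = list(relevant)
    for i in relevant:
        for j in selected:
            if i != j and i in selected and cond_mutual_info(X[:, i], y, X[:, j]) <= eps:
                selected.remove(i)  # X_i carries no information about y beyond X_j
                break
    return selected

# Toy demonstration: x0 determines y, x1 is a redundant copy, x2 is irrelevant.
y = np.repeat([0, 1], 20)
X = np.column_stack([
    np.repeat([0, 1], 20),  # x0: fully determines y
    np.repeat([0, 1], 20),  # x1: redundant copy of x0
    np.tile([0, 1], 20),    # x2: independent of y
])
sel = fsci_sketch(X, y)
print(sel)  # exactly one of the duplicated features survives; x2 is filtered out
```

In this toy run, the irrelevant feature is removed by the MI filter and one of the two duplicated features is removed by the conditional independence test, leaving a single selected feature.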