Simple Stopping Criteria for Information Theoretic Feature Selection
Feature selection aims to select the smallest feature subset that yields the minimum generalization error. In the rich literature on feature selection, information-theoretic approaches seek a subset of features such that the mutual information between the selected features and the class labels is maximized. Despite the simplicity of this objective, several open problems remain in its optimization, including, for example, the automatic determination of the optimal subset size (i.e., the number of features) or a stopping criterion when a greedy search strategy is adopted. In this paper, we suggest two stopping criteria that only require monitoring the conditional mutual information (CMI) among groups of variables. Using the recently developed multivariate matrix-based Rényi's α-entropy functional, which can be estimated directly from data samples, we show that the CMI among groups of variables can be computed easily, without any decomposition or approximation, making our criteria easy to implement and seamlessly integrable into any existing information-theoretic feature selection method that uses a greedy search strategy.
Main Authors: | Shujian Yu, José C. Príncipe |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2019-01-01 |
Series: | Entropy |
Subjects: | feature selection; stopping criterion; conditional mutual information; multivariate matrix-based Rényi's α-entropy functional |
Online Access: | https://www.mdpi.com/1099-4300/21/1/99 |
DOI: | 10.3390/e21010099 |
ISSN: | 1099-4300 |
Affiliation: | Computational NeuroEngineering Laboratory, University of Florida, Gainesville, FL 32611, USA |
Collection: | Directory of Open Access Journals (DOAJ) |
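
The abstract rests on the multivariate matrix-based Rényi's α-entropy functional, in which entropy is computed from the eigenvalues of a trace-normalized Gram matrix and joint entropy from the Hadamard product of Gram matrices, so that CMI follows from the usual entropy identity without density estimation. The NumPy sketch below illustrates that construction under stated assumptions: the Gaussian kernel, the bandwidth `sigma`, and all function names are illustrative choices, not the authors' reference implementation.

```python
# Minimal sketch of the matrix-based Rényi α-entropy functional and the CMI
# built from it. Kernel choice, bandwidth, and function names are assumptions.
import numpy as np

def gram_matrix(X, sigma=1.0):
    """Trace-normalized Gaussian Gram matrix A (trace(A) = 1), X of shape (n, d)."""
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-d2 / (2.0 * sigma**2))
    return K / np.trace(K)

def renyi_entropy(A, alpha=1.01):
    """S_α(A) = (1 / (1 - α)) · log2( Σ_i λ_i(A)^α )."""
    lam = np.clip(np.linalg.eigvalsh(A), 0.0, None)  # guard round-off negatives
    return np.log2(np.sum(lam ** alpha)) / (1.0 - alpha)

def joint_entropy(mats, alpha=1.01):
    """Joint entropy via the trace-normalized Hadamard product of Gram matrices."""
    H = mats[0].copy()
    for M in mats[1:]:
        H = H * M                         # elementwise (Hadamard) product
    return renyi_entropy(H / np.trace(H), alpha)

def cmi(A, B, C, alpha=1.01):
    """I_α(X; Y | Z) = S(X,Z) + S(Y,Z) - S(Z) - S(X,Y,Z), all matrix-based."""
    return (joint_entropy([A, C], alpha) + joint_entropy([B, C], alpha)
            - renyi_entropy(C, alpha) - joint_entropy([A, B, C], alpha))
```

For example, with feature groups `X1`, `X2` and labels `y`, `cmi(gram_matrix(X1), gram_matrix(y.reshape(-1, 1)), gram_matrix(X2))` estimates I(X1; y | X2) directly from samples, with no chain-rule decomposition or approximation, which is the property the abstract highlights.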
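
The criteria are meant to plug into greedy forward selection by monitoring CMI. The sketch below, reusing the helpers above, shows one plausible reading of such a rule: stop once the unselected features carry essentially no additional information about the labels given the selected set. The ranking rule and the threshold `eps` are hypothetical placeholders; the paper's two actual criteria are not reproduced here.

```python
def mi(A, B, alpha=1.01):
    """I_α(X; Y) = S(X) + S(Y) - S(X,Y)."""
    return (renyi_entropy(A, alpha) + renyi_entropy(B, alpha)
            - joint_entropy([A, B], alpha))

def greedy_select(X, y, alpha=1.01, sigma=1.0, eps=1e-3):
    """Forward selection; stop when I(X_remaining; y | X_selected) < eps (assumed rule)."""
    d = X.shape[1]
    B = gram_matrix(y.reshape(-1, 1).astype(float), sigma)    # label Gram matrix
    feats = [gram_matrix(X[:, [j]], sigma) for j in range(d)]
    selected, remaining = [], list(range(d))
    while remaining:
        if selected:
            C = gram_matrix(X[:, selected], sigma)
            scores = [cmi(feats[j], B, C, alpha) for j in remaining]
        else:
            scores = [mi(feats[j], B, alpha) for j in remaining]
        selected.append(remaining.pop(int(np.argmax(scores))))
        if not remaining:
            break
        # Stopping rule: remaining features add (almost) no label information
        # once we condition on the selected subset.
        rest = gram_matrix(X[:, remaining], sigma)
        cond = gram_matrix(X[:, selected], sigma)
        if cmi(rest, B, cond, alpha) < eps:
            break
    return selected
```

Because the CMI among whole groups of variables is computed in one shot (via Hadamard products of Gram matrices), the same stopping check could be bolted onto any existing greedy information-theoretic selector, as the abstract claims.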