Simple Stopping Criteria for Information Theoretic Feature Selection


Bibliographic Details
Main Authors: Shujian Yu, José C. Príncipe
Format: Article
Language: English
Published: MDPI AG 2019-01-01
Series: Entropy
Subjects: feature selection; stopping criterion; conditional mutual information; multivariate matrix-based Rényi’s α-entropy functional
Online Access: https://www.mdpi.com/1099-4300/21/1/99
author Shujian Yu
José C. Príncipe
collection DOAJ
description Feature selection aims to select the smallest feature subset that yields the minimum generalization error. In the rich literature on feature selection, information-theoretic approaches seek a subset of features such that the mutual information between the selected features and the class labels is maximized. Despite the simplicity of this objective, several open problems remain in its optimization. These include, for example, the automatic determination of the optimal subset size (i.e., the number of features), or a stopping criterion when a greedy search strategy is adopted. In this paper, we suggest two stopping criteria that simply monitor the conditional mutual information (CMI) among groups of variables. Using the recently developed multivariate matrix-based Rényi’s α-entropy functional, which can be estimated directly from data samples, we show that the CMI among groups of variables can be computed easily without any decomposition or approximation, making our criteria easy to implement and to integrate seamlessly into any existing information-theoretic feature selection method that uses a greedy search strategy.
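The abstract's key computational ingredient is the matrix-based Rényi's α-entropy functional and the CMI quantity assembled from it. The sketch below is not the authors' code; it is a minimal illustration of the standard matrix-based formulation (trace-normalized Gaussian Gram matrices, eigenvalue-based entropy, Hadamard products for joint entropies). The kernel width `sigma` and the choice `alpha = 1.01` are illustrative assumptions, not values prescribed by this record.

```python
import numpy as np

def gram(x, sigma=1.0):
    """Trace-normalized Gaussian Gram matrix of the samples in x (n, d)."""
    x = np.atleast_2d(x.T).T  # promote 1-D input to shape (n, 1)
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    return k / np.trace(k)

def renyi_entropy(a, alpha=1.01):
    """Matrix-based Renyi entropy: S_alpha(A) = log2(sum_i lam_i^alpha) / (1 - alpha)."""
    lam = np.clip(np.linalg.eigvalsh(a), 0.0, None)  # eigenvalues of a PSD, trace-1 matrix
    return np.log2((lam ** alpha).sum()) / (1.0 - alpha)

def joint(*mats):
    """Joint entropy matrix: trace-normalized Hadamard product of Gram matrices."""
    h = mats[0]
    for m in mats[1:]:
        h = h * m  # elementwise (Hadamard) product
    return h / np.trace(h)

def cmi(gx, gy, gz, alpha=1.01):
    """I(X; Y | Z) = S(X,Z) + S(Y,Z) - S(Z) - S(X,Y,Z), all matrix-based."""
    s = lambda a: renyi_entropy(a, alpha)
    return s(joint(gx, gz)) + s(joint(gy, gz)) - s(gz) - s(joint(gx, gy, gz))
```

In a greedy forward search, one would monitor `cmi(gram(candidate), gram(labels), gram(selected))` and stop once it falls below a threshold; everything here is computed from Gram-matrix eigenvalues, with no density estimation, decomposition, or approximation of the joint distribution.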
format Article
id doaj.art-c4c8835c4ba24df8a8d5d281530206a2
institution Directory Open Access Journal
issn 1099-4300
language English
publishDate 2019-01-01
publisher MDPI AG
series Entropy
doi 10.3390/e21010099
author_affiliation Shujian Yu: Computational NeuroEngineering Laboratory, University of Florida, Gainesville, FL 32611, USA
author_affiliation José C. Príncipe: Computational NeuroEngineering Laboratory, University of Florida, Gainesville, FL 32611, USA
title Simple Stopping Criteria for Information Theoretic Feature Selection
topic feature selection
stopping criterion
conditional mutual information
multivariate matrix-based Rényi’s α-entropy functional
url https://www.mdpi.com/1099-4300/21/1/99