Information Theoretic Methods for Variable Selection—A Review
We review the principal information theoretic tools and their use for feature selection, with the main emphasis on classification problems with discrete features. Since it is known that empirical versions of conditional mutual information perform poorly for high-dimensional problems, we focus on var...
Main Author: | Jan Mielniczuk |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-08-01 |
Series: | Entropy |
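Subjects: | conditional independence; interaction information; Möbius expansion; Markov blanket; feature selection |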
Online Access: | https://www.mdpi.com/1099-4300/24/8/1079 |
_version_ | 1797445593982828544 |
---|---|
author | Jan Mielniczuk |
author_facet | Jan Mielniczuk |
author_sort | Jan Mielniczuk |
collection | DOAJ |
description | We review the principal information theoretic tools and their use for feature selection, with the main emphasis on classification problems with discrete features. Since it is known that empirical versions of conditional mutual information perform poorly for high-dimensional problems, we focus on various ways of constructing its counterparts and the properties and limitations of such methods. We present a unified way of constructing such measures based on truncation, or truncation and weighing, for the Möbius expansion of conditional mutual information. We also discuss the main approaches to feature selection which apply the introduced measures of conditional dependence, together with the ways of assessing the quality of the obtained vector of predictors. This involves discussion of recent results on asymptotic distributions of empirical counterparts of criteria, as well as advances in resampling. |
first_indexed | 2024-03-09T13:29:07Z |
format | Article |
id | doaj.art-4abbdf70fede429e8baf7593c780a67d |
institution | Directory Open Access Journal |
issn | 1099-4300 |
language | English |
last_indexed | 2024-03-09T13:29:07Z |
publishDate | 2022-08-01 |
publisher | MDPI AG |
record_format | Article |
series | Entropy |
spelling | doaj.art-4abbdf70fede429e8baf7593c780a67d; 2023-11-30T21:20:21Z; eng; MDPI AG; Entropy; ISSN 1099-4300; 2022-08-01; vol. 24, no. 8, art. 1079; doi:10.3390/e24081079; Information Theoretic Methods for Variable Selection—A Review; Jan Mielniczuk (Institute of Computer Science, Polish Academy of Sciences, Jana Kazimierza 5, 01-248 Warsaw, Poland); https://www.mdpi.com/1099-4300/24/8/1079; conditional independence; interaction information; Möbius expansion; Markov blanket; feature selection |
title | Information Theoretic Methods for Variable Selection—A Review |
topic | conditional independence; interaction information; Möbius expansion; Markov blanket; feature selection |
url | https://www.mdpi.com/1099-4300/24/8/1079 |