Information Theoretic Methods for Variable Selection—A Review

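We review the principal information theoretic tools and their use for feature selection, with the main emphasis on classification problems with discrete features. Since it is known that empirical versions of conditional mutual information perform poorly for high-dimensional problems, we focus on various ways of constructing its counterparts and the properties and limitations of such methods. We present a unified way of constructing such measures based on truncation, or truncation and weighing, for the Möbius expansion of conditional mutual information. We also discuss the main approaches to feature selection which apply the introduced measures of conditional dependence, together with the ways of assessing the quality of the obtained vector of predictors. This involves discussion of recent results on asymptotic distributions of empirical counterparts of criteria, as well as advances in resampling.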
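As an illustration of the "empirical versions of conditional mutual information" mentioned above, the following is a minimal sketch of a plug-in estimator for discrete data, using the identity I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z). The function names are hypothetical and are not taken from the article; the sketch only shows the naive estimator whose poor high-dimensional behaviour motivates the alternatives the review discusses.

```python
from collections import Counter
from math import log


def entropy(*columns):
    """Plug-in joint entropy (in nats) of one or more discrete columns."""
    rows = list(zip(*columns))  # one tuple per observation
    n = len(rows)
    return -sum(c / n * log(c / n) for c in Counter(rows).values())


def mutual_information(x, y):
    """Empirical I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def conditional_mutual_information(x, y, z):
    """Empirical I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


# Toy data (an assumption for illustration): y = x XOR z.
x = [0, 0, 1, 1]
z = [0, 1, 0, 1]
y = [a ^ b for a, b in zip(x, z)]

mi = mutual_information(x, y)                   # x alone says nothing about y
cmi = conditional_mutual_information(x, y, z)   # given z, x determines y
```

In this toy case the marginal mutual information is zero while the conditional one equals log 2, which is why criteria based on conditional dependence can find features that marginal screening misses. With many conditioning features, the number of cells in the joint table grows quickly and most are empty, so the estimate becomes unreliable.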

Bibliographic Details
Main Author: Jan Mielniczuk
Format: Article
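Language: English
Published: MDPI AG 2022-08-01
Series: Entropy
Subjects: conditional independence; interaction information; Möbius expansion; Markov blanket; feature selection
Online Access: https://www.mdpi.com/1099-4300/24/8/1079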
author Jan Mielniczuk
collection DOAJ
description We review the principal information theoretic tools and their use for feature selection, with the main emphasis on classification problems with discrete features. Since it is known that empirical versions of conditional mutual information perform poorly for high-dimensional problems, we focus on various ways of constructing its counterparts and the properties and limitations of such methods. We present a unified way of constructing such measures based on truncation, or truncation and weighing, for the Möbius expansion of conditional mutual information. We also discuss the main approaches to feature selection which apply the introduced measures of conditional dependence, together with the ways of assessing the quality of the obtained vector of predictors. This involves discussion of recent results on asymptotic distributions of empirical counterparts of criteria, as well as advances in resampling.
first_indexed 2024-03-09T13:29:07Z
format Article
id doaj.art-4abbdf70fede429e8baf7593c780a67d
institution Directory Open Access Journal
issn 1099-4300
language English
last_indexed 2024-03-09T13:29:07Z
publishDate 2022-08-01
publisher MDPI AG
record_format Article
series Entropy
spelling doaj.art-4abbdf70fede429e8baf7593c780a67d; 2023-11-30T21:20:21Z; eng; MDPI AG; Entropy; ISSN 1099-4300; 2022-08-01; vol. 24, no. 8, art. 1079; doi:10.3390/e24081079; Jan Mielniczuk (Institute of Computer Science, Polish Academy of Sciences, Jana Kazimierza 5, 01-248 Warsaw, Poland)
topic conditional independence
interaction information
Möbius expansion
Markov blanket
feature selection
url https://www.mdpi.com/1099-4300/24/8/1079