Detection of Representative Variables in Complex Systems with Interpretable Rules Using Core-Clusters

In this paper, we present a new framework dedicated to the robust detection of representative variables in high dimensional spaces with a potentially limited number of observations. Representative variables are selected by using an original regularization strategy: they are the center of specific va...

Bibliographic Details
Main Authors:	Camille Champion, Anne-Claire Brunet, Rémy Burcelin, Jean-Michel Loubes, Laurent Risser
Format:	Article
Language:	English
Published:	MDPI AG 2021-02-01
Series:	Algorithms
Subjects:	feature selection; representative variable detection; interpretable machine learning; regularization; complex data; graph clustering
Online Access:	https://www.mdpi.com/1999-4893/14/2/66

author	Camille Champion, Anne-Claire Brunet, Rémy Burcelin, Jean-Michel Loubes, Laurent Risser
collection	DOAJ
description	In this paper, we present a new framework dedicated to the robust detection of representative variables in high-dimensional spaces with a potentially limited number of observations. Representative variables are selected using an original regularization strategy: they are the centers of specific variable clusters, denoted CORE-clusters, which respect fully interpretable constraints. Each CORE-cluster contains more than a predefined number of variables, and each pair of its variables has a coherent behavior in the observed data. The key advantage of our regularization strategy is therefore that it requires tuning only two intuitive parameters: the minimal dimension of the CORE-clusters and the minimum level of similarity that gathers their variables. Interpreting the role played by a selected representative variable is also straightforward, as its observed behavior is similar to that of a controlled number of other variables. After introducing and justifying this variable selection formalism, we propose two algorithmic strategies to detect the CORE-clusters, one of which scales particularly well to high-dimensional data. Finally, results obtained on synthetic and real data are presented.
first_indexed	2024-03-09T00:38:41Z
format	Article
id	doaj.art-a4e6f0d76765453a83ad5a4b0ac407be
institution	Directory Open Access Journal
issn	1999-4893
language	English
last_indexed	2024-03-09T00:38:41Z
publishDate	2021-02-01
publisher	MDPI AG
record_format	Article
series	Algorithms
spelling	doaj.art-a4e6f0d76765453a83ad5a4b0ac407be; 2023-12-11T17:57:43Z; eng; MDPI AG; Algorithms; ISSN 1999-4893; 2021-02-01; vol. 14, no. 2, article 66; DOI 10.3390/a14020066; Detection of Representative Variables in Complex Systems with Interpretable Rules Using Core-Clusters; Camille Champion (Toulouse Mathematics Institute (UMR 5219), CNRS, University of Toulouse, F-31062 Toulouse, France); Anne-Claire Brunet (Toulouse Mathematics Institute (UMR 5219), CNRS, University of Toulouse, F-31062 Toulouse, France); Rémy Burcelin (Institute of Cardiovascular and Metabolic Diseases INSERM, F-31432 Toulouse, France); Jean-Michel Loubes (Toulouse Mathematics Institute (UMR 5219), CNRS, University of Toulouse, F-31062 Toulouse, France); Laurent Risser (Toulouse Mathematics Institute (UMR 5219), CNRS, University of Toulouse, F-31062 Toulouse, France); https://www.mdpi.com/1999-4893/14/2/66
title	Detection of Representative Variables in Complex Systems with Interpretable Rules Using Core-Clusters
topic	feature selection; representative variable detection; interpretable machine learning; regularization; complex data; graph clustering
url	https://www.mdpi.com/1999-4893/14/2/66

Similar Items