Detection of Representative Variables in Complex Systems with Interpretable Rules Using Core-Clusters
In this paper, we present a new framework dedicated to the robust detection of representative variables in high dimensional spaces with a potentially limited number of observations. Representative variables are selected by using an original regularization strategy: they are the center of specific va...
Main Authors: | Camille Champion, Anne-Claire Brunet, Rémy Burcelin, Jean-Michel Loubes, Laurent Risser |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-02-01 |
Series: | Algorithms |
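Subjects: | feature selection, representative variable detection, interpretable machine learning, regularization, complex data, graph clustering |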
Online Access: | https://www.mdpi.com/1999-4893/14/2/66 |
_version_ | 1797395722519183360 |
---|---|
author | Camille Champion; Anne-Claire Brunet; Rémy Burcelin; Jean-Michel Loubes; Laurent Risser |
author_facet | Camille Champion; Anne-Claire Brunet; Rémy Burcelin; Jean-Michel Loubes; Laurent Risser |
author_sort | Camille Champion |
collection | DOAJ |
description | In this paper, we present a new framework dedicated to the robust detection of representative variables in high-dimensional spaces with a potentially limited number of observations. Representative variables are selected using an original regularization strategy: they are the centers of specific variable clusters, denoted CORE-clusters, which respect fully interpretable constraints. Each CORE-cluster contains more than a predefined number of variables, and each pair of its variables behaves coherently in the observed data. The key advantage of this regularization strategy is therefore that only two intuitive parameters need to be tuned: the minimal dimension of the CORE-clusters and the minimum level of similarity that links their variables. Interpreting the role played by a selected representative variable is also straightforward, as its observed behavior is similar to that of a controlled number of other variables. After introducing and justifying this variable selection formalism, we propose two algorithmic strategies to detect the CORE-clusters, one of which scales particularly well to high-dimensional data. Results obtained on synthetic as well as real data are finally presented. |
first_indexed | 2024-03-09T00:38:41Z |
format | Article |
id | doaj.art-a4e6f0d76765453a83ad5a4b0ac407be |
institution | Directory of Open Access Journals |
issn | 1999-4893 |
language | English |
last_indexed | 2024-03-09T00:38:41Z |
publishDate | 2021-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Algorithms |
spelling | doaj.art-a4e6f0d76765453a83ad5a4b0ac407be; 2023-12-11T17:57:43Z; eng; MDPI AG; Algorithms; 1999-4893; 2021-02-01; 14; 2; 66; 10.3390/a14020066; Detection of Representative Variables in Complex Systems with Interpretable Rules Using Core-Clusters; Camille Champion (0), Anne-Claire Brunet (1), Rémy Burcelin (2), Jean-Michel Loubes (3), Laurent Risser (4); (0) Toulouse Mathematics Institute (UMR 5219), CNRS, University of Toulouse, F-31062 Toulouse, France; (1) Toulouse Mathematics Institute (UMR 5219), CNRS, University of Toulouse, F-31062 Toulouse, France; (2) Institute of Cardiovascular and Metabolic Diseases INSERM, F-31432 Toulouse, France; (3) Toulouse Mathematics Institute (UMR 5219), CNRS, University of Toulouse, F-31062 Toulouse, France; (4) Toulouse Mathematics Institute (UMR 5219), CNRS, University of Toulouse, F-31062 Toulouse, France; In this paper, we present a new framework dedicated to the robust detection of representative variables in high dimensional spaces with a potentially limited number of observations. Representative variables are selected by using an original regularization strategy: they are the center of specific variable clusters, denoted CORE-clusters, which respect fully interpretable constraints. Each CORE-cluster indeed contains more than a predefined amount of variables and each pair of its variables has a coherent behavior in the observed data. The key advantage of our regularization strategy is therefore that it only requires to tune two intuitive parameters: the minimal dimension of the CORE-clusters and the minimum level of similarity which gathers their variables. Interpreting the role played by a selected representative variable is additionally obvious as it has a similar observed behaviour as a controlled number of other variables. After introducing and justifying this variable selection formalism, we propose two algorithmic strategies to detect the CORE-clusters, one of them scaling particularly well to high-dimensional data. Results obtained on synthetic as well as real data are finally presented.; https://www.mdpi.com/1999-4893/14/2/66; feature selection; representative variable detection; interpretable machine learning; regularization; complex data; graph clustering |
spellingShingle | Camille Champion; Anne-Claire Brunet; Rémy Burcelin; Jean-Michel Loubes; Laurent Risser; Detection of Representative Variables in Complex Systems with Interpretable Rules Using Core-Clusters; Algorithms; feature selection; representative variable detection; interpretable machine learning; regularization; complex data; graph clustering |
title | Detection of Representative Variables in Complex Systems with Interpretable Rules Using Core-Clusters |
title_sort | detection of representative variables in complex systems with interpretable rules using core clusters |
topic | feature selection; representative variable detection; interpretable machine learning; regularization; complex data; graph clustering |
url | https://www.mdpi.com/1999-4893/14/2/66 |
work_keys_str_mv | AT camillechampion detectionofrepresentativevariablesincomplexsystemswithinterpretablerulesusingcoreclusters AT anneclairebrunet detectionofrepresentativevariablesincomplexsystemswithinterpretablerulesusingcoreclusters AT remyburcelin detectionofrepresentativevariablesincomplexsystemswithinterpretablerulesusingcoreclusters AT jeanmichelloubes detectionofrepresentativevariablesincomplexsystemswithinterpretablerulesusingcoreclusters AT laurentrisser detectionofrepresentativevariablesincomplexsystemswithinterpretablerulesusingcoreclusters |
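
The description field characterizes a CORE-cluster by two constraints: it holds at least a chosen number of variables, and every pair of its variables is sufficiently similar in the observed data. The sketch below illustrates those two constraints only. It is a simple greedy search written for this record, not either of the two algorithmic strategies proposed in the article, which this record does not describe. The function name `greedy_core_clusters`, the parameters `min_size` and `min_sim`, the use of absolute Pearson correlation as the similarity, and the choice of the center as the member most similar to the rest of its cluster are all assumptions made for illustration.

```python
import numpy as np


def greedy_core_clusters(X, min_size, min_sim):
    """Illustrative greedy search; NOT the article's algorithms.

    X is an (n_observations, n_variables) array. Returns a list of
    (center_index, sorted_member_indices) pairs, one per detected cluster.
    """
    # Assumption: similarity is the absolute Pearson correlation between variables.
    sim = np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(sim, 0.0)

    unassigned = set(range(sim.shape[0]))
    clusters = []

    # Visit seeds starting with those that have the most neighbours above the threshold.
    order = np.argsort(-(sim >= min_sim).sum(axis=1), kind="stable")
    for seed in order:
        seed = int(seed)
        if seed not in unassigned:
            continue
        members = [seed]
        # Try candidates by decreasing similarity to the seed; keep a candidate
        # only if it is similar enough to EVERY current member (pairwise constraint).
        for cand in np.argsort(-sim[seed], kind="stable"):
            cand = int(cand)
            if cand == seed or cand not in unassigned:
                continue
            if all(sim[cand, m] >= min_sim for m in members):
                members.append(cand)
        # Size constraint: discard groups that are too small.
        if len(members) >= min_size:
            sub = sim[np.ix_(members, members)]
            # Assumed definition of the center: highest total similarity to the others.
            center = members[int(np.argmax(sub.sum(axis=1)))]
            clusters.append((center, sorted(members)))
            unassigned -= set(members)
    return clusters
```

Calling `greedy_core_clusters(X, min_size=3, min_sim=0.8)` on a data matrix returns one representative index per detected group. Variables that do not belong to any sufficiently large and coherent group are left unassigned, which mirrors the regularizing role the description attributes to the two parameters.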