Multivariate pattern analysis: a method and software to reveal, quantify, and visualize predictive association patterns in multicollinear data
Abstract. Background: Strongly multicollinear covariates, such as those typical of metabolomics applications, pose a challenge for multivariate regression analysis. This challenge is commonly circumvented by reducing the number of covariates to a subset of linearly independent va...
Main Authors: | Tim U. H. Baumeister, Eivind Aadland, Roger G. Linington, Olav M. Kvalheim |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2024-01-01 |
Series: | BMC Bioinformatics |
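Subjects: | Multivariate pattern analysis; Multicollinear covariates; Net association patterns; Latent variable projection; Covariate projection; Target projection |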
Online Access: | https://doi.org/10.1186/s12859-024-05660-6 |
_version_ | 1797273060475142144 |
---|---|
author | Tim U. H. Baumeister; Eivind Aadland; Roger G. Linington; Olav M. Kvalheim |
author_facet | Tim U. H. Baumeister; Eivind Aadland; Roger G. Linington; Olav M. Kvalheim |
author_sort | Tim U. H. Baumeister |
collection | DOAJ |
description | Abstract. Background: Strongly multicollinear covariates, such as those typical of metabolomics applications, pose a challenge for multivariate regression analysis. This challenge is commonly circumvented by reducing the number of covariates to a subset of linearly independent variables, but this strategy may lead to loss of resolution and thus produce models with poorer interpretative potential. The aim of this work was to implement and illustrate a method, multivariate pattern analysis (MVPA), which can handle multivariate covariates without compromising resolution or model quality. Results: MVPA has been implemented in an open-source R package of the same name, mvpa. To facilitate the usage and interpretation of complex association patterns, mvpa has also been integrated into an R shiny app, mvpaShiny, which can be accessed at www.mvpashiny.org. MVPA utilizes a general projection algorithm that embraces a diversity of possible models. The method handles multicollinear and even linearly dependent covariates. MVPA separates the variance in the data into orthogonal parts within the frame of a single joint model: one part describing the relations between covariates, outcome, and explanatory variables, and another part describing the “net” predictive association pattern between outcome and explanatory variables. These patterns are visualized and interpreted in variance plots and in plots for pattern analysis and ranking according to variable importance. Adjustment for a linearly dependent covariate is performed in three steps. First, partial least squares (PLS) regression with repeated Monte Carlo resampling is used to determine the number of predictive PLS components for a model relating the covariate to the outcome. Second, postprocessing of this PLS model by target projection provides a single component expressing the predictive association pattern between the outcome and the covariate. Third, the outcome and the explanatory variables are adjusted for the covariate by using the target score in the projection algorithm to obtain “net” data. We illustrate the main features of MVPA by investigating the partial mediation of a linearly dependent metabolomics descriptor on the association pattern between a measure of insulin resistance and lifestyle-related factors. Conclusions: Our method and implementation in R extend the range of possible analyses and visualizations that can be performed for complex multivariate data structures. The R packages are available on github.com/liningtonlab/mvpa and github.com/liningtonlab/mvpaShiny. |
first_indexed | 2024-03-07T14:38:09Z |
format | Article |
id | doaj.art-c50c39b11b2847fe8c3518ae29dcc382 |
institution | Directory Open Access Journal |
issn | 1471-2105 |
language | English |
last_indexed | 2024-03-07T14:38:09Z |
publishDate | 2024-01-01 |
publisher | BMC |
record_format | Article |
series | BMC Bioinformatics |
spelling | doaj.art-c50c39b11b2847fe8c3518ae29dcc382 2024-03-05T20:32:00Z eng BMC BMC Bioinformatics 1471-2105 2024-01-01 251121 10.1186/s12859-024-05660-6 Tim U. H. Baumeister (Department of Chemistry, Simon Fraser University); Eivind Aadland (Department of Sport, Food and Natural Sciences, Western Norway University of Applied Sciences); Roger G. Linington (Department of Chemistry, Simon Fraser University); Olav M. Kvalheim (Department of Chemistry, University of Bergen) |
title | Multivariate pattern analysis: a method and software to reveal, quantify, and visualize predictive association patterns in multicollinear data |
topic | Multivariate pattern analysis; Multicollinear covariates; Net association patterns; Latent variable projection; Covariate projection; Target projection |
url | https://doi.org/10.1186/s12859-024-05660-6 |
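
The three-step covariate adjustment described in the abstract can be sketched numerically. The following is a minimal illustration in Python with NumPy, not the API of the mvpa R package: the function names (`pls1_coefficients`, `target_score`, `project_out`), the simulated data, and the fixed choice of two PLS components are all assumptions made for this example. In the published method the number of components is chosen by repeated Monte Carlo resampling, which is omitted here.

```python
import numpy as np

def pls1_coefficients(Z, y, n_components):
    """PLS1 (NIPALS) regression coefficients for centered Z (n x k) and y (n,)."""
    Zk, yk = Z.copy(), y.copy()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Zk.T @ yk
        w = w / np.linalg.norm(w)        # unit-length weight vector
        t = Zk @ w                       # component score
        tt = t @ t
        p = Zk.T @ t / tt                # loading of Z on the score
        c = (yk @ t) / tt                # regression of y on the score
        Zk = Zk - np.outer(t, p)         # deflate Z
        yk = yk - c * t                  # deflate y
        W.append(w); P.append(p); q.append(c)
    W = np.column_stack(W)
    P = np.column_stack(P)
    # b = W (P'W)^{-1} q expresses the model in terms of the original Z
    return W @ np.linalg.solve(P.T @ W, np.array(q))

def target_score(Z, b):
    """Target projection: collapse the PLS model into one score vector."""
    return Z @ b / np.linalg.norm(b)

def project_out(M, t):
    """Remove from M (vector or matrix) the part linearly explained by t."""
    coef = (t @ M) / (t @ t)
    return M - np.multiply.outer(t, coef)

# Simulated data (hypothetical): Z is a strongly collinear covariate block.
rng = np.random.default_rng(0)
n = 50
latent = rng.normal(size=(n, 2))
Z = latent @ rng.normal(size=(2, 6)) + 0.01 * rng.normal(size=(n, 6))
X = rng.normal(size=(n, 4)) + latent[:, [0]]
y = latent[:, 0] + X[:, 1] + 0.1 * rng.normal(size=n)
Z, X, y = Z - Z.mean(axis=0), X - X.mean(axis=0), y - y.mean()

b = pls1_coefficients(Z, y, n_components=2)  # step 1 (component count assumed)
t = target_score(Z, b)                       # step 2: single target score
X_net = project_out(X, t)                    # step 3: "net" explanatory data
y_net = project_out(y, t)                    # step 3: "net" outcome
```

By construction, `t` is orthogonal to `y_net` and to every column of `X_net`, so a subsequent model of `y_net` on `X_net` describes only variation that is not linearly explained by the covariate's target score.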