Multivariate pattern analysis: a method and software to reveal, quantify, and visualize predictive association patterns in multicollinear data


Bibliographic Details
Main Authors: Tim U. H. Baumeister, Eivind Aadland, Roger G. Linington, Olav M. Kvalheim
Format: Article
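Language: English
Published: BMC 2024-01-01
Series: BMC Bioinformatics
Subjects: Multivariate pattern analysis; Multicollinear covariates; Net association patterns; Latent variable projection; Covariate projection; Target projection
Online Access: https://doi.org/10.1186/s12859-024-05660-6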
author Tim U. H. Baumeister
Eivind Aadland
Roger G. Linington
Olav M. Kvalheim
collection DOAJ
description Abstract
Background: Strongly multicollinear covariates, such as those typically encountered in metabolomics applications, pose a challenge for multivariate regression analysis. This challenge is commonly circumvented by reducing the covariates to a subset of linearly independent variables, but this strategy may lead to loss of resolution and thus produce models with poorer interpretative potential. The aim of this work was to implement and illustrate a method, multivariate pattern analysis (MVPA), which can handle multivariate covariates without compromising resolution or model quality.
Results: MVPA has been implemented in an open-source R package of the same name, mvpa. To facilitate the usage and interpretation of complex association patterns, mvpa has also been integrated into an R Shiny app, mvpaShiny, which can be accessed at www.mvpashiny.org. MVPA uses a general projection algorithm that accommodates a diversity of possible models. The method handles multicollinear and even linearly dependent covariates. MVPA separates the variance in the data into orthogonal parts within the frame of a single joint model: one part describing the relations between covariates, outcome, and explanatory variables, and another part describing the “net” predictive association pattern between outcome and explanatory variables. These patterns are visualized and interpreted in variance plots and in plots for pattern analysis and ranking according to variable importance. Adjustment for a linearly dependent covariate is performed in three steps. First, partial least squares (PLS) regression with repeated Monte Carlo resampling is used to determine the number of predictive PLS components for a model relating the covariate to the outcome. Second, postprocessing of this PLS model by target projection provides a single component expressing the predictive association pattern between the outcome and the covariate. Third, the outcome and the explanatory variables are adjusted for the covariate by using the target score in the projection algorithm to obtain “net” data. We illustrate the main features of MVPA by investigating the partial mediation of a linearly dependent metabolomics descriptor on the association pattern between a measure of insulin resistance and lifestyle-related factors.
Conclusions: Our method and implementation in R extend the range of possible analyses and visualizations that can be performed for complex multivariate data structures. The R packages are available at github.com/liningtonlab/mvpa and github.com/liningtonlab/mvpaShiny.
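The three adjustment steps described in the abstract can be sketched roughly as follows. This is a minimal pure-Python illustration, not the mvpa package's API: the function names (`center`, `dot`, `target_score`, `project_out`) and the toy data are hypothetical. It also makes one simplifying assumption: the number of PLS components is fixed at one instead of being chosen by repeated Monte Carlo resampling, in which case the target-projected component is taken to coincide with the single PLS component.

```python
# Illustrative sketch only; names and data are hypothetical, not from mvpa.

def center(col):
    """Return a mean-centered copy of a list of numbers."""
    mean = sum(col) / len(col)
    return [v - mean for v in col]

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def target_score(z_cols, y):
    """Score vector summarizing covariate block Z with respect to outcome y.

    Assumption: a single PLS component, so the weights are
    w = Z'y / ||Z'y|| and the score is t = Z w.
    z_cols is a list of centered covariate columns.
    """
    w = [dot(z, y) for z in z_cols]
    norm = dot(w, w) ** 0.5
    w = [wj / norm for wj in w]
    return [sum(wj * z[i] for wj, z in zip(w, z_cols)) for i in range(len(y))]

def project_out(v, t):
    """Remove from v the part that lies along the score vector t."""
    coef = dot(t, v) / dot(t, t)
    return [vi - coef * ti for vi, ti in zip(v, t)]

# Toy data, invented for illustration: z2 is an exact multiple of z1,
# so the two covariates are linearly dependent.
z1 = center([1.0, 2.0, 3.0, 4.0, 5.0])
z2 = [2.0 * v for v in z1]
y = center([1.1, 1.9, 3.2, 3.8, 5.0])   # outcome
x = center([2.0, 1.0, 4.0, 3.0, 6.0])   # one explanatory variable

t = target_score([z1, z2], y)   # steps 1-2, with one component assumed
y_net = project_out(y, t)       # step 3: "net" outcome
x_net = project_out(x, t)       # step 3: "net" explanatory variable

# The net data are orthogonal to the covariate score.
print(abs(dot(y_net, t)) < 1e-9, abs(dot(x_net, t)) < 1e-9)
```

Because the covariate block is summarized by a single score vector before projection, the linear dependence between `z1` and `z2` causes no problem here, whereas it would make an ordinary least-squares adjustment on both columns ill-defined.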
first_indexed 2024-03-07T14:38:09Z
format Article
id doaj.art-c50c39b11b2847fe8c3518ae29dcc382
institution Directory of Open Access Journals
issn 1471-2105
language English
last_indexed 2024-03-07T14:38:09Z
publishDate 2024-01-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-c50c39b11b2847fe8c3518ae29dcc382 2024-03-05T20:32:00Z eng BMC BMC Bioinformatics 1471-2105 2024-01-01 volume 25, issue 1, pages 1-21 10.1186/s12859-024-05660-6
Tim U. H. Baumeister (Department of Chemistry, Simon Fraser University)
Eivind Aadland (Department of Sport, Food and Natural Sciences, Western Norway University of Applied Sciences)
Roger G. Linington (Department of Chemistry, Simon Fraser University)
Olav M. Kvalheim (Department of Chemistry, University of Bergen)
topic Multivariate pattern analysis
Multicollinear covariates
Net association patterns
Latent variable projection
Covariate projection
Target projection
url https://doi.org/10.1186/s12859-024-05660-6