A Phylogeny-Regularized Sparse Regression Model for Predictive Modeling of Microbial Community Data

Fueled by technological advancement, there has been a surge of human microbiome studies surveying the microbial communities associated with the human body and their links with health and disease. As a complement to the human genome, the human microbiome holds great potential for precision medicine....

Bibliographic Details
Main Authors: Jian Xiao, Li Chen, Yue Yu, Xianyang Zhang, Jun Chen
Format: Article
Language: English
Published: Frontiers Media S.A. 2018-12-01
Series: Frontiers in Microbiology
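Subjects: microbiome, phylogenetic tree, sparse generalized linear model, predictive model, statistical modeling, high-dimensional statistics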
Online Access: https://www.frontiersin.org/article/10.3389/fmicb.2018.03112/full
author Jian Xiao
Li Chen
Yue Yu
Xianyang Zhang
Jun Chen
author_sort Jian Xiao
collection DOAJ
description Fueled by technological advancement, there has been a surge of human microbiome studies surveying the microbial communities associated with the human body and their links with health and disease. As a complement to the human genome, the human microbiome holds great potential for precision medicine. Efficient predictive models based on microbiome data could be potentially used in various clinical applications such as disease diagnosis, patient stratification and drug response prediction. One important characteristic of the microbial community data is the phylogenetic tree that relates all the microbial taxa based on their evolutionary history. The phylogenetic tree is an informative prior for more efficient prediction since the microbial community changes are usually not randomly distributed on the tree but tend to occur in clades at varying phylogenetic depths (clustered signal). Although community-wide changes are possible for some conditions, it is also likely that the community changes are only associated with a small subset of “marker” taxa (sparse signal). Unfortunately, predictive models of microbial community data taking into account both the sparsity and the tree structure remain under-developed. In this paper, we propose a predictive framework to exploit sparse and clustered microbiome signals using a phylogeny-regularized sparse regression model. Our approach is motivated by evolutionary theory, where a natural correlation structure among microbial taxa exists according to the phylogenetic relationship. A novel phylogeny-based smoothness penalty is proposed to smooth the coefficients of the microbial taxa with respect to the phylogenetic tree. Using simulated and real datasets, we show that our method achieves better prediction performance than competing sparse regression methods for sparse and clustered microbiome signals.
first_indexed 2024-12-10T15:38:02Z
format Article
id doaj.art-a2b3147a66934a92a9c04940f9805f99
institution Directory Open Access Journal
issn 1664-302X
language English
last_indexed 2024-12-10T15:38:02Z
publishDate 2018-12-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Microbiology
spelling doaj.art-a2b3147a66934a92a9c04940f9805f99
2022-12-22T01:43:11Z
eng
Frontiers Media S.A.
Frontiers in Microbiology
1664-302X
2018-12-01
9
10.3389/fmicb.2018.03112
422587
A Phylogeny-Regularized Sparse Regression Model for Predictive Modeling of Microbial Community Data
Jian Xiao (0, 1)
Li Chen (2)
Yue Yu (3)
Xianyang Zhang (4)
Jun Chen (5)
(0) Division of Biomedical Statistics and Informatics, Center for Individualized Medicine, Mayo Clinic, Rochester, MN, United States
(1) School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan, China
(2) Department of Health Outcomes Research and Policy, Harrison School of Pharmacy, Auburn University, Auburn, AL, United States
(3) Division of Biomedical Statistics and Informatics, Center for Individualized Medicine, Mayo Clinic, Rochester, MN, United States
(4) Department of Statistics, Texas A&M University, College Station, TX, United States
(5) Division of Biomedical Statistics and Informatics, Center for Individualized Medicine, Mayo Clinic, Rochester, MN, United States
https://www.frontiersin.org/article/10.3389/fmicb.2018.03112/full
microbiome
phylogenetic tree
sparse generalized linear model
predictive model
statistical modeling
high-dimensional statistics
title A Phylogeny-Regularized Sparse Regression Model for Predictive Modeling of Microbial Community Data
topic microbiome
phylogenetic tree
sparse generalized linear model
predictive model
statistical modeling
high-dimensional statistics
url https://www.frontiersin.org/article/10.3389/fmicb.2018.03112/full