Sparse and compositionally robust inference of microbial ecological networks.

Bibliographic Details
Main Authors: Zachary D Kurtz, Christian L Müller, Emily R Miraldi, Dan R Littman, Martin J Blaser, Richard A Bonneau
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2015-05-01
Series: PLoS Computational Biology
Online Access: http://europepmc.org/articles/PMC4423992?pdf=render
collection DOAJ
description 16S ribosomal RNA (rRNA) gene and other environmental sequencing techniques provide snapshots of microbial communities, revealing phylogeny and the abundances of microbial populations across diverse ecosystems. While changes in microbial community structure are demonstrably associated with certain environmental conditions (from metabolic and immunological health in mammals to ecological stability in soils and oceans), identification of underlying mechanisms requires new statistical tools, as these datasets present several technical challenges. First, the abundances of microbial operational taxonomic units (OTUs) from amplicon-based datasets are compositional: counts are normalized to the total number of counts in the sample. Thus, microbial abundances are not independent, and traditional statistical metrics (e.g., correlation) for the detection of OTU-OTU relationships can lead to spurious results. Second, microbial sequencing-based studies typically measure hundreds of OTUs on only tens to hundreds of samples; thus, inference of OTU-OTU association networks is severely under-powered, and additional information (or assumptions) is required for accurate inference. Here, we present SPIEC-EASI (SParse InversE Covariance Estimation for Ecological Association Inference), a statistical method for the inference of microbial ecological networks from amplicon sequencing datasets that addresses both of these issues. SPIEC-EASI combines data transformations developed for compositional data analysis with a graphical model inference framework that assumes the underlying ecological association network is sparse. To reconstruct the network, SPIEC-EASI relies on algorithms for sparse neighborhood and inverse covariance selection. To provide a synthetic benchmark in the absence of an experimentally validated gold-standard network, SPIEC-EASI is accompanied by a set of computational tools to generate OTU count data from a set of diverse underlying network topologies. SPIEC-EASI outperforms state-of-the-art methods in recovering edges and network properties on synthetic data under a variety of scenarios. SPIEC-EASI also reproducibly predicts previously unknown microbial associations using data from the American Gut project.
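The compositional point in the description can be illustrated with a small sketch. The abstract does not name a specific transformation; the centered log-ratio (clr) used here is one standard choice from compositional data analysis and is an assumption of this example, as are the function names, the sample counts, and the pseudocount of 1 used to avoid taking the log of zero. This is not the SPIEC-EASI implementation, and it covers only the transformation step, not the sparse network inference.

```python
import math

def closure(values):
    """Normalize one sample so its entries sum to 1 (relative abundances)."""
    total = sum(values)
    return [v / total for v in values]

def clr(values):
    """Centered log-ratio of strictly positive values: log(v) minus the mean log."""
    logs = [math.log(v) for v in values]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]

# Hypothetical OTU counts for a single sample (illustrative only).
counts = [120, 30, 0, 850]

# Assumed pseudocount of 1 so that zero counts have a finite logarithm.
shifted = [c + 1 for c in counts]

clr_from_counts = clr(shifted)
clr_from_proportions = clr(closure(shifted))
```

Because clr subtracts the mean log, dividing every entry by the sample total leaves the result unchanged, so the transformed values do not depend on sequencing depth, and each transformed sample sums to zero.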
id doaj.art-752f5ef6eeac4cfc919259032b5c8211
institution Directory Open Access Journal
issn 1553-734X
1553-7358
doi 10.1371/journal.pcbi.1004226
citation PLoS Computational Biology 11(5): e1004226 (2015-05-01)
title Sparse and compositionally robust inference of microbial ecological networks.