Accurate genome relative abundance estimation based on shotgun metagenomic reads.

Accurate estimation of microbial community composition from metagenomic sequencing data is fundamental for subsequent metagenomics analysis. Prevalent estimation methods mainly summarize alignment results directly, or use variants of that approach, and often yield biased and/or unstable estimates....

Bibliographic Details
Main Authors:	Li C Xia, Jacob A Cram, Ting Chen, Jed A Fuhrman, Fengzhu Sun
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2011-01-01
Series:	PLoS ONE
Online Access:	http://europepmc.org/articles/PMC3232206?pdf=render

_version_	1828522152000749568
author	Li C Xia, Jacob A Cram, Ting Chen, Jed A Fuhrman, Fengzhu Sun
author_sort	Li C Xia
collection	DOAJ
description	Accurate estimation of microbial community composition from metagenomic sequencing data is fundamental for subsequent metagenomics analysis. Prevalent estimation methods mainly summarize alignment results directly, or use variants of that approach, and often yield biased and/or unstable estimates. We have developed a unified probabilistic framework (named GRAMMy) that explicitly models read assignment ambiguities, genome size biases and read distributions along the genomes. A maximum likelihood method is employed to compute the Genome Relative Abundance of microbial communities using Mixture Model theory (GRAMMy). GRAMMy has been demonstrated to give accurate and robust estimates across both simulated and real read benchmark datasets. We applied GRAMMy to a collection of 34 metagenomic read sets from four metagenomics projects and identified 99 frequent species (at least 0.5% abundant in at least 50% of the datasets) in the human gut samples. Our results show substantial improvements over previous studies, such as correcting the overestimated abundance of Bacteroides species in human gut samples, by providing a new reference-based strategy for metagenomic sample comparisons. GRAMMy can be used flexibly with many read assignment tools (mapping, alignment or composition-based), even with low-sensitivity mapping results from huge short-read datasets. It will be increasingly useful as an accurate and robust tool for abundance estimation as read sets grow and the database of reference genomes expands.
first_indexed	2024-12-11T20:02:26Z
format	Article
id	doaj.art-675cd04f9c07458fa1015826d10f7cce
institution	Directory of Open Access Journals
issn	1932-6203
language	English
last_indexed	2024-12-11T20:02:26Z
publishDate	2011-01-01
publisher	Public Library of Science (PLoS)
record_format	Article
series	PLoS ONE
spelling	doaj.art-675cd04f9c07458fa1015826d10f7cce; 2022-12-22T00:52:29Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2011-01-01; 6; 12; e27992; 10.1371/journal.pone.0027992; Accurate genome relative abundance estimation based on shotgun metagenomic reads.; Li C Xia; Jacob A Cram; Ting Chen; Jed A Fuhrman; Fengzhu Sun; http://europepmc.org/articles/PMC3232206?pdf=render
title	Accurate genome relative abundance estimation based on shotgun metagenomic reads.
url	http://europepmc.org/articles/PMC3232206?pdf=render

Accurate genome relative abundance estimation based on shotgun metagenomic reads.
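
The description field of this record names a mixture model fitted by maximum likelihood, with a correction for genome size. The following is a minimal sketch of that general idea, not the authors' GRAMMy implementation. It assumes a precomputed matrix read_probs (a hypothetical name) giving the probability of each read under each reference genome, fits mixing proportions with a standard expectation-maximization loop, and then divides by genome length to obtain relative abundances. All function and parameter names are illustrative assumptions.

```python
from typing import List, Sequence


def em_mixing_proportions(
    read_probs: Sequence[Sequence[float]],
    n_iter: int = 200,
    tol: float = 1e-10,
) -> List[float]:
    """Estimate mixing proportions of a finite mixture with EM.

    read_probs[i][j] is an assumed, precomputed probability of read i
    given genome j (for example, derived from mapping or alignment hits).
    """
    n_genomes = len(read_probs[0])
    pi = [1.0 / n_genomes] * n_genomes  # start from a uniform mixture
    for _ in range(n_iter):
        totals = [0.0] * n_genomes
        used = 0
        for row in read_probs:
            # E-step: posterior responsibility of each genome for this read.
            weights = [p * q for p, q in zip(pi, row)]
            norm = sum(weights)
            if norm == 0.0:
                continue  # read matches no reference genome; skip it
            used += 1
            for j, w in enumerate(weights):
                totals[j] += w / norm
        if used == 0:
            raise ValueError("no read has support from any genome")
        # M-step: new proportions are the average responsibilities.
        new_pi = [t / used for t in totals]
        delta = max(abs(a - b) for a, b in zip(new_pi, pi))
        pi = new_pi
        if delta < tol:
            break
    return pi


def genome_relative_abundance(
    pi: Sequence[float], genome_lengths: Sequence[float]
) -> List[float]:
    """Convert read-level mixing proportions to genome relative abundances.

    Longer genomes contribute more reads per cell, so each proportion is
    divided by its genome length and the result is renormalized.
    """
    per_base = [p / length for p, length in zip(pi, genome_lengths)]
    total = sum(per_base)
    return [x / total for x in per_base]


if __name__ == "__main__":
    # Toy input: three reads hit only genome 0, one read hits only genome 1.
    toy = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    proportions = em_mixing_proportions(toy)
    print(proportions)
    print(genome_relative_abundance(proportions, [3000, 1000]))
```

In the toy input, genome 0 receives three quarters of the reads but is three times as long as genome 1, so the length correction yields equal relative abundances for the two genomes. This illustrates why summarizing read counts alone can overestimate organisms with large genomes.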

Similar Items