A probabilistic model to recover individual genomes from metagenomes

Bibliographic Details
Main Authors: Johannes Dröge, Alexander Schönhuth, Alice C. McHardy
Format: Article
Language: English
Published: PeerJ Inc. 2017-05-01
Series: PeerJ Computer Science
Online Access: https://peerj.com/articles/cs-117.pdf
Description
Summary: Shotgun metagenomics of microbial communities reveals information about strains of relevance for applications in medicine, biotechnology and ecology. Recovering their genomes is a crucial but very challenging step due to the complexity of the underlying biological system and technical factors. Microbial communities are heterogeneous, often containing hundreds of genomes that derive from different species or strains, all at varying abundances and with different degrees of similarity to each other and to reference data. We present a versatile probabilistic model for genome recovery and analysis, which aggregates three types of information that are commonly used for genome recovery from metagenomes. As potential applications, we showcase metagenome contig classification, genome sample enrichment and genome bin comparisons. The open source implementation MGLEX is available via the Python Package Index and on GitHub and can be embedded into metagenome analysis workflows and programs.
ISSN: 2376-5992