MetaGen: reference-free learning with multiple metagenomic samples
Abstract: A major goal of metagenomics is to identify and study the entire collection of microbial species in a set of targeted samples. We describe a statistical metagenomic algorithm that simultaneously identifies microbial species and estimates their abundances without using reference genomes. As...
Main Authors: | Xin Xing, Jun S. Liu, Wenxuan Zhong |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2017-10-01 |
Series: | Genome Biology |
Subjects: | Metagenomics, Binning, Mixture model, Multinomial, Unsupervised learning |
Online Access: | http://link.springer.com/article/10.1186/s13059-017-1323-y |
_version_ | 1819083816823685120 |
---|---|
author | Xin Xing; Jun S. Liu; Wenxuan Zhong |
author_facet | Xin Xing; Jun S. Liu; Wenxuan Zhong |
author_sort | Xin Xing |
collection | DOAJ |
description | Abstract: A major goal of metagenomics is to identify and study the entire collection of microbial species in a set of targeted samples. We describe a statistical metagenomic algorithm that simultaneously identifies microbial species and estimates their abundances without using reference genomes. As a trade-off, we require multiple metagenomic samples, usually ≥10 samples, to get highly accurate binning results. Compared to reference-free methods based primarily on k-mer distributions or coverage information, the proposed approach achieves a higher species binning accuracy and is particularly powerful when sequencing coverage is low. We demonstrated the performance of this new method through both simulation and real metagenomic studies. The MetaGen software is available at https://github.com/BioAlgs/MetaGen. |
first_indexed | 2024-12-21T20:38:35Z |
format | Article |
id | doaj.art-f0473b5ef87f44c594224ca35cc641dd |
institution | Directory Open Access Journal |
issn | 1474-760X |
language | English |
last_indexed | 2024-12-21T20:38:35Z |
publishDate | 2017-10-01 |
publisher | BMC |
record_format | Article |
series | Genome Biology |
spelling | doaj.art-f0473b5ef87f44c594224ca35cc641dd; 2022-12-21T18:51:02Z; eng; BMC; Genome Biology; 1474-760X; 2017-10-01; 181115; 10.1186/s13059-017-1323-y; MetaGen: reference-free learning with multiple metagenomic samples; Xin Xing (Department of Statistics, University of Georgia); Jun S. Liu (Department of Statistics, Harvard University); Wenxuan Zhong (Department of Statistics, University of Georgia); Abstract: A major goal of metagenomics is to identify and study the entire collection of microbial species in a set of targeted samples. We describe a statistical metagenomic algorithm that simultaneously identifies microbial species and estimates their abundances without using reference genomes. As a trade-off, we require multiple metagenomic samples, usually ≥10 samples, to get highly accurate binning results. Compared to reference-free methods based primarily on k-mer distributions or coverage information, the proposed approach achieves a higher species binning accuracy and is particularly powerful when sequencing coverage is low. We demonstrated the performance of this new method through both simulation and real metagenomic studies. The MetaGen software is available at https://github.com/BioAlgs/MetaGen.; http://link.springer.com/article/10.1186/s13059-017-1323-y; Metagenomics; Binning; Mixture model; Multinomial; Unsupervised learning |
spellingShingle | Xin Xing; Jun S. Liu; Wenxuan Zhong; MetaGen: reference-free learning with multiple metagenomic samples; Genome Biology; Metagenomics; Binning; Mixture model; Multinomial; Unsupervised learning |
title | MetaGen: reference-free learning with multiple metagenomic samples |
title_full | MetaGen: reference-free learning with multiple metagenomic samples |
title_fullStr | MetaGen: reference-free learning with multiple metagenomic samples |
title_full_unstemmed | MetaGen: reference-free learning with multiple metagenomic samples |
title_short | MetaGen: reference-free learning with multiple metagenomic samples |
title_sort | metagen reference free learning with multiple metagenomic samples |
topic | Metagenomics; Binning; Mixture model; Multinomial; Unsupervised learning |
url | http://link.springer.com/article/10.1186/s13059-017-1323-y |
work_keys_str_mv | AT xinxing metagenreferencefreelearningwithmultiplemetagenomicsamples AT junsliu metagenreferencefreelearningwithmultiplemetagenomicsamples AT wenxuanzhong metagenreferencefreelearningwithmultiplemetagenomicsamples |
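
The abstract and topic terms in this record (binning across multiple samples, mixture model, multinomial) can be illustrated with a toy example. The sketch below is not the MetaGen implementation. It is a generic EM fit of a multinomial mixture over a hypothetical contig-by-sample read-count matrix. The function name `fit_multinomial_mixture`, its parameters, the farthest-point initialisation, and the toy counts are all assumptions made for illustration.

```python
import math


def _normalise(values):
    """Scale non-negative numbers so they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def fit_multinomial_mixture(counts, n_clusters, n_iter=30):
    """Fit a mixture of multinomials by EM (illustrative only, not MetaGen).

    counts[i][j] is a hypothetical number of reads from sample j that
    were assigned to contig i. Returns (labels, weights, profiles).
    """
    n, m = len(counts), len(counts[0])
    # Smoothed per-contig read fractions, used only to pick starting profiles.
    fracs = [_normalise([c + 1.0 for c in row]) for row in counts]

    # Deterministic farthest-point initialisation (an assumption of this sketch):
    # start from contig 0, then add the contig farthest (L1) from all chosen seeds.
    seeds = [0]
    while len(seeds) < n_clusters:
        gaps = [
            min(sum(abs(a - b) for a, b in zip(fracs[i], fracs[s])) for s in seeds)
            for i in range(n)
        ]
        seeds.append(gaps.index(max(gaps)))
    profiles = [list(fracs[s]) for s in seeds]
    weights = [1.0 / n_clusters] * n_clusters

    eps = 1e-6  # small pseudo-count that keeps every logarithm finite
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each contig.
        resp = []
        for row in counts:
            logp = [
                math.log(weights[k])
                + sum(x * math.log(p) for x, p in zip(row, profiles[k]))
                for k in range(n_clusters)
            ]
            top = max(logp)  # subtract the maximum before exponentiating
            resp.append(_normalise([math.exp(v - top) for v in logp]))
        # M-step: re-estimate mixing weights and per-sample read profiles.
        for k in range(n_clusters):
            mass = sum(r[k] for r in resp)
            weights[k] = (mass + eps) / (n + n_clusters * eps)
            profiles[k] = _normalise(
                [eps + sum(resp[i][k] * counts[i][j] for i in range(n)) for j in range(m)]
            )

    # Hard assignment: each contig goes to its most probable cluster.
    labels = [r.index(max(r)) for r in resp]
    return labels, weights, profiles


# Toy data: six contigs, four samples, two groups with opposite coverage patterns.
toy_counts = [
    [30, 28, 2, 1],
    [15, 16, 1, 0],
    [60, 55, 3, 4],
    [1, 2, 25, 27],
    [0, 3, 50, 48],
    [2, 1, 12, 14],
]
labels, weights, profiles = fit_multinomial_mixture(toy_counts, n_clusters=2)
print(labels)
```

In this sketch each row of `profiles` is a cluster's share of reads across the samples, which is only a rough stand-in for the abundance estimates the abstract describes. The point of the example is that contigs whose read counts rise and fall together across samples end up in the same bin, which is why several samples are needed.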