Reanalyze unassigned reads in Sanger based metagenomic data using conserved gene adjacency
<p>Abstract</p> <p>Background</p> <p>Investigation of metagenomes provides greater insight into uncultured microbial communities. The improvement in sequencing technology, which yields a large amount of sequence data, has led to major breakthroughs in the field. However...
Main Authors: | Hsu Ming-Tsung, Su Chien-Hao, Weng Francis C, Wang Tse-Yi, Tsai Huai-Kuang, Wang Daryi |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2010-11-01 |
Series: | BMC Bioinformatics |
Online Access: | http://www.biomedcentral.com/1471-2105/11/565 |
_version_ | 1818149692391817216 |
---|---|
author | Hsu Ming-Tsung Su Chien-Hao Weng Francis C Wang Tse-Yi Tsai Huai-Kuang Wang Daryi |
author_facet | Hsu Ming-Tsung Su Chien-Hao Weng Francis C Wang Tse-Yi Tsai Huai-Kuang Wang Daryi |
author_sort | Hsu Ming-Tsung |
collection | DOAJ |
description | <p>Abstract</p> <p>Background</p> <p>Investigation of metagenomes provides greater insight into uncultured microbial communities. The improvement in sequencing technology, which yields a large amount of sequence data, has led to major breakthroughs in the field. However, at present, taxonomic binning tools for metagenomes discard 30-40% of Sanger sequencing data due to the stringency of BLAST cut-offs. In an attempt to provide a comprehensive overview of metagenomic data, we re-analyzed the discarded metagenomes by using less stringent cut-offs. Additionally, we introduced a new criterion, namely, the evolutionary conservation of adjacency between neighboring genes. To evaluate the feasibility of our approach, we re-analyzed discarded contigs and singletons from several environments with different levels of complexity. We also compared the consistency between our taxonomic binning and those reported in the original studies.</p> <p>Results</p> <p>Among the discarded data, we found that 23.7 ± 3.9% of singletons and 14.1 ± 1.0% of contigs were assigned to taxa. The recovery rates for singletons were higher than those for contigs. The <it>Pearson </it>correlation coefficient revealed a high degree of similarity (0.94 ± 0.03 at the phylum rank and 0.80 ± 0.11 at the family rank) between the proposed taxonomic binning approach and those reported in original studies. In addition, an evaluation using simulated data demonstrated the reliability of the proposed approach.</p> <p>Conclusions</p> <p>Our findings suggest that taking account of conserved neighboring gene adjacency improves taxonomic assignment when analyzing metagenomes using Sanger sequencing. In other words, utilizing the conserved gene order as a criterion will reduce the amount of data discarded when analyzing metagenomes.</p> |
first_indexed | 2024-12-11T13:11:04Z |
format | Article |
id | doaj.art-db9e3ad2642f4a4d8e5fea1baad94981 |
institution | Directory Open Access Journal |
issn | 1471-2105 |
language | English |
last_indexed | 2024-12-11T13:11:04Z |
publishDate | 2010-11-01 |
publisher | BMC |
record_format | Article |
series | BMC Bioinformatics |
spelling | doaj.art-db9e3ad2642f4a4d8e5fea1baad949812022-12-22T01:06:10ZengBMCBMC Bioinformatics1471-21052010-11-0111156510.1186/1471-2105-11-565Reanalyze unassigned reads in Sanger based metagenomic data using conserved gene adjacencyHsu Ming-TsungSu Chien-HaoWeng Francis CWang Tse-YiTsai Huai-KuangWang Daryi<p>Abstract</p> <p>Background</p> <p>Investigation of metagenomes provides greater insight into uncultured microbial communities. The improvement in sequencing technology, which yields a large amount of sequence data, has led to major breakthroughs in the field. However, at present, taxonomic binning tools for metagenomes discard 30-40% of Sanger sequencing data due to the stringency of BLAST cut-offs. In an attempt to provide a comprehensive overview of metagenomic data, we re-analyzed the discarded metagenomes by using less stringent cut-offs. Additionally, we introduced a new criterion, namely, the evolutionary conservation of adjacency between neighboring genes. To evaluate the feasibility of our approach, we re-analyzed discarded contigs and singletons from several environments with different levels of complexity. We also compared the consistency between our taxonomic binning and those reported in the original studies.</p> <p>Results</p> <p>Among the discarded data, we found that 23.7 ± 3.9% of singletons and 14.1 ± 1.0% of contigs were assigned to taxa. The recovery rates for singletons were higher than those for contigs. The <it>Pearson </it>correlation coefficient revealed a high degree of similarity (0.94 ± 0.03 at the phylum rank and 0.80 ± 0.11 at the family rank) between the proposed taxonomic binning approach and those reported in original studies. In addition, an evaluation using simulated data demonstrated the reliability of the proposed approach.</p> <p>Conclusions</p> <p>Our findings suggest that taking account of conserved neighboring gene adjacency improves taxonomic assignment when analyzing metagenomes using Sanger sequencing. In other words, utilizing the conserved gene order as a criterion will reduce the amount of data discarded when analyzing metagenomes.</p>http://www.biomedcentral.com/1471-2105/11/565 |
spellingShingle | Hsu Ming-Tsung Su Chien-Hao Weng Francis C Wang Tse-Yi Tsai Huai-Kuang Wang Daryi Reanalyze unassigned reads in Sanger based metagenomic data using conserved gene adjacency BMC Bioinformatics |
title | Reanalyze unassigned reads in Sanger based metagenomic data using conserved gene adjacency |
title_full | Reanalyze unassigned reads in Sanger based metagenomic data using conserved gene adjacency |
title_fullStr | Reanalyze unassigned reads in Sanger based metagenomic data using conserved gene adjacency |
title_full_unstemmed | Reanalyze unassigned reads in Sanger based metagenomic data using conserved gene adjacency |
title_short | Reanalyze unassigned reads in Sanger based metagenomic data using conserved gene adjacency |
title_sort | reanalyze unassigned reads in sanger based metagenomic data using conserved gene adjacency |
url | http://www.biomedcentral.com/1471-2105/11/565 |
work_keys_str_mv | AT hsumingtsung reanalyzeunassignedreadsinsangerbasedmetagenomicdatausingconservedgeneadjacency AT suchienhao reanalyzeunassignedreadsinsangerbasedmetagenomicdatausingconservedgeneadjacency AT wengfrancisc reanalyzeunassignedreadsinsangerbasedmetagenomicdatausingconservedgeneadjacency AT wangtseyi reanalyzeunassignedreadsinsangerbasedmetagenomicdatausingconservedgeneadjacency AT tsaihuaikuang reanalyzeunassignedreadsinsangerbasedmetagenomicdatausingconservedgeneadjacency AT wangdaryi reanalyzeunassignedreadsinsangerbasedmetagenomicdatausingconservedgeneadjacency |
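
The abstract reports Pearson correlation coefficients between the taxonomic profile obtained from the recovered reads and the profile reported in the original studies. The following is a minimal sketch of how such a comparison could be computed, not the authors' implementation. The function names `pearson` and `profile_correlation`, the dictionary layout (taxon name mapped to read count), and the choice to count a taxon missing from one profile as zero are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def profile_correlation(original, recovered):
    """Correlate two taxon -> count profiles (hypothetical data layout).

    Assumption: a taxon absent from one profile is counted as zero there.
    """
    taxa = sorted(set(original) | set(recovered))
    x = [original.get(t, 0) for t in taxa]
    y = [recovered.get(t, 0) for t in taxa]
    return pearson(x, y)
```

A call such as `profile_correlation(original_counts, recovered_counts)` on phylum-level or family-level count dictionaries returns a value between -1 and 1, with values near 1 indicating that the two binnings have similar taxonomic composition.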