FastGroup: A program to dereplicate libraries of 16S rDNA sequences

Abstract. Background: Ribosomal 16S DNA sequences are an essential tool for identifying and classifying microbes. High-throughput DNA sequencing now makes it economically possible to produce very large datasets of 16S rDNA sequences in short time periods,...

Bibliographic Details
Main Authors: Rohwer Forest, Seguritan Victor
Format: Article
Language: English
Published: BMC 2001-10-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/2/9
_version_ 1811285932470435840
author Rohwer Forest
Seguritan Victor
author_facet Rohwer Forest
Seguritan Victor
author_sort Rohwer Forest
collection DOAJ
description Abstract. Background: Ribosomal 16S DNA sequences are an essential tool for identifying and classifying microbes. High-throughput DNA sequencing now makes it economically possible to produce very large datasets of 16S rDNA sequences in short time periods, necessitating new computer tools for analyses. Here we describe FastGroup, a Java program designed to dereplicate libraries of 16S rDNA sequences. By dereplication we mean to: 1) compare all the sequences in a data set to each other, 2) group similar sequences together, and 3) output a representative sequence from each group. In this way, duplicate sequences are removed from a library. Results: FastGroup was tested using a library of single-pass, bacterial 16S rDNA sequences cloned from coral-associated bacteria. We found that the optimal strategy for dereplicating these sequences was to: 1) trim ambiguous bases from the 5' end of the sequences and all sequence 3' of the conserved Bact517 site, 2) match the sequences from the 3' end, and 3) group sequences >=97% identical to each other. Conclusions: The FastGroup program simplifies the dereplication of 16S rDNA sequence libraries and prepares the raw sequences for subsequent analyses.
first_indexed 2024-04-13T02:52:07Z
format Article
id doaj.art-4c50f80a3e1d4d70adb2f220de10bc77
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-04-13T02:52:07Z
publishDate 2001-10-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-4c50f80a3e1d4d70adb2f220de10bc77 2022-12-22T03:05:49Z eng BMC BMC Bioinformatics 1471-2105 2001-10-01 2 1 9 10.1186/1471-2105-2-9 FastGroup: A program to dereplicate libraries of 16S rDNA sequences Rohwer Forest Seguritan Victor Abstract. Background: Ribosomal 16S DNA sequences are an essential tool for identifying and classifying microbes. High-throughput DNA sequencing now makes it economically possible to produce very large datasets of 16S rDNA sequences in short time periods, necessitating new computer tools for analyses. Here we describe FastGroup, a Java program designed to dereplicate libraries of 16S rDNA sequences. By dereplication we mean to: 1) compare all the sequences in a data set to each other, 2) group similar sequences together, and 3) output a representative sequence from each group. In this way, duplicate sequences are removed from a library. Results: FastGroup was tested using a library of single-pass, bacterial 16S rDNA sequences cloned from coral-associated bacteria. We found that the optimal strategy for dereplicating these sequences was to: 1) trim ambiguous bases from the 5' end of the sequences and all sequence 3' of the conserved Bact517 site, 2) match the sequences from the 3' end, and 3) group sequences >=97% identical to each other. Conclusions: The FastGroup program simplifies the dereplication of 16S rDNA sequence libraries and prepares the raw sequences for subsequent analyses. http://www.biomedcentral.com/1471-2105/2/9
spellingShingle Rohwer Forest
Seguritan Victor
FastGroup: A program to dereplicate libraries of 16S rDNA sequences
BMC Bioinformatics
title FastGroup: A program to dereplicate libraries of 16S rDNA sequences
title_full FastGroup: A program to dereplicate libraries of 16S rDNA sequences
title_fullStr FastGroup: A program to dereplicate libraries of 16S rDNA sequences
title_full_unstemmed FastGroup: A program to dereplicate libraries of 16S rDNA sequences
title_short FastGroup: A program to dereplicate libraries of 16S rDNA sequences
title_sort fastgroup a program to dereplicate libraries of 16s rdna sequences
url http://www.biomedcentral.com/1471-2105/2/9
work_keys_str_mv AT rohwerforest fastgroupaprogramtodereplicatelibrariesof16srdnasequences
AT seguritanvictor fastgroupaprogramtodereplicatelibrariesof16srdnasequences
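
Illustrative sketch: the dereplication strategy described in the abstract (trim, match from the 3' end, group at >=97% identity) might look roughly like the following in Java, the language the abstract says FastGroup is written in. This is a minimal sketch under stated assumptions, not FastGroup's actual code. The class and method names are hypothetical; the conserved site is passed in as a parameter, and the value used in main is a placeholder rather than the real Bact517 sequence; identity is computed without gaps over the shorter of the two sequences; and grouping is a simple greedy comparison against the first member of each group.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of trim / match / group; not the FastGroup source.
public class DereplicationSketch {

    // Step 1a: drop leading characters until the first unambiguous base (A, C, G or T).
    public static String trimAmbiguous5Prime(String seq) {
        int i = 0;
        while (i < seq.length() && "ACGT".indexOf(seq.charAt(i)) < 0) {
            i++;
        }
        return seq.substring(i);
    }

    // Step 1b: keep everything up to and including the first occurrence of the
    // conserved site. If the site is absent, the sequence is returned unchanged
    // (an assumption of this sketch).
    public static String trim3PrimeOfSite(String seq, String site) {
        int pos = seq.indexOf(site);
        return pos < 0 ? seq : seq.substring(0, pos + site.length());
    }

    // Step 2: align the two sequences at their 3' ends and return the fraction of
    // identical positions over the shorter length (ungapped; an assumption).
    public static double identityFrom3Prime(String a, String b) {
        int overlap = Math.min(a.length(), b.length());
        if (overlap == 0) {
            return 0.0;
        }
        int matches = 0;
        for (int k = 1; k <= overlap; k++) {
            if (a.charAt(a.length() - k) == b.charAt(b.length() - k)) {
                matches++;
            }
        }
        return (double) matches / overlap;
    }

    // Step 3: greedy grouping. A sequence joins the first group whose representative
    // it matches at or above the threshold; otherwise it starts a new group.
    // Returns one representative (the first member) per group.
    public static List<String> dereplicate(List<String> rawSeqs, String site, double threshold) {
        List<String> representatives = new ArrayList<>();
        for (String raw : rawSeqs) {
            String seq = trim3PrimeOfSite(trimAmbiguous5Prime(raw.toUpperCase()), site);
            boolean grouped = false;
            for (String rep : representatives) {
                if (identityFrom3Prime(seq, rep) >= threshold) {
                    grouped = true;
                    break;
                }
            }
            if (!grouped) {
                representatives.add(seq);
            }
        }
        return representatives;
    }

    public static void main(String[] args) {
        // "GGATCC" is a placeholder site used only for this toy example.
        List<String> library = List.of(
            "NNACGTACGTACGGATCCTTTT",
            "ACGTACGTACGGATCCAAAA",
            "TTTTTTTTTTGGATCC");
        System.out.println(dereplicate(library, "GGATCC", 0.97));
    }
}
```

In the toy library, the first two reads become identical once trimmed and collapse into one group, while the third differs upstream of the placeholder site and is kept as a second representative. Anchoring the comparison at the 3' end means reads of different lengths are compared over the region nearest the conserved site, which is the reason the sketch indexes from the end of each string.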