Comparison of three clustering approaches for detecting novel environmental microbial diversity

Bibliographic Details
Main Authors: Dominik Forster, Micah Dunthorn, Thorsten Stoeck, Frédéric Mahé
Format: Article
Language: English
Published: PeerJ Inc. 2016-02-01
Series: PeerJ
Subjects: Environmental diversity; Barcoding; Molecular operational taxonomic unit
Online Access: https://peerj.com/articles/1692.pdf
collection DOAJ
description Discovery of novel diversity in high-throughput sequencing studies is an important aspect in environmental microbial ecology. To evaluate the effects that amplicon clustering methods have on the discovery of novel diversity, we clustered an environmental marine high-throughput sequencing dataset of protist amplicons together with reference sequences from the taxonomically curated Protist Ribosomal Reference (PR2) database using three de novo approaches: sequence similarity networks, USEARCH, and Swarm. The potentially novel diversity uncovered by each clustering approach differed drastically in the number of operational taxonomic units (OTUs) and in the number of environmental amplicons in these novel diversity OTUs. Global pairwise alignment comparisons revealed that numerous amplicons classified as potentially novel by USEARCH and Swarm were more than 97% similar to references of PR2. Using shortest path analyses on sequence similarity network OTUs and Swarm OTUs we found additional novel diversity within OTUs that would have gone unnoticed without further exploiting their underlying network topologies. These results demonstrate that graph theory provides powerful tools for microbial ecology and the analysis of environmental high-throughput sequencing datasets. Furthermore, sequence similarity networks were most accurate in delineating novel diversity from previously discovered diversity.
first_indexed 2024-03-09T06:53:33Z
id doaj.art-cf3edeff9de946b09cec1a755ba91047
institution Directory Open Access Journal
issn 2167-8359
last_indexed 2024-03-09T06:53:33Z
doi 10.7717/peerj.1692
volume 4
article_number e1692
author_affiliation Department of Ecology, Technische Universität Kaiserslautern, Kaiserslautern, Germany (all four authors)
topic Environmental diversity
Barcoding
Molecular operational taxonomic unit