Species classifier choice is a key consideration when analysing low-complexity food microbiome data

Abstract. Background: The use of shotgun metagenomics to analyse low-complexity microbial communities in foods has the potential to be of considerable fundamental and applied value. However, there is currently no consensus with respect to choice of species classification tool, platform, or sequencing...

Bibliographic Details
Main Authors:	Aaron M. Walsh, Fiona Crispie, Orla O’Sullivan, Laura Finnegan, Marcus J. Claesson, Paul D. Cotter
Format:	Article
Language:	English
Published:	BMC 2018-03-01
Series:	Microbiome
Subjects:	Shotgun metagenomics; Sequencing platform comparison; Low-complexity microbiome
Online Access:	http://link.springer.com/article/10.1186/s40168-018-0437-0

_version_	1818900860759965696
author	Aaron M. Walsh; Fiona Crispie; Orla O’Sullivan; Laura Finnegan; Marcus J. Claesson; Paul D. Cotter
author_sort	Aaron M. Walsh
collection	DOAJ
description	Abstract. Background: The use of shotgun metagenomics to analyse low-complexity microbial communities in foods has the potential to be of considerable fundamental and applied value. However, there is currently no consensus with respect to choice of species classification tool, platform, or sequencing depth. Here, we benchmarked the performances of three high-throughput short-read sequencing platforms, the Illumina MiSeq, NextSeq 500, and Ion Proton, for shotgun metagenomics of food microbiota. Briefly, we sequenced six kefir DNA samples and a mock community DNA sample, the latter constructed by evenly mixing genomic DNA from 13 food-related bacterial species. A variety of bioinformatic tools were used to analyse the data generated, and the effects of sequencing depth on these analyses were tested by randomly subsampling reads. Results: Compositional analysis results were consistent between the platforms at divergent sequencing depths. However, we observed pronounced differences in the predictions from species classification tools. Indeed, PERMANOVA indicated that there were no significant differences between the compositional results generated by the different sequencers (p = 0.693, R² = 0.011), but there was a significant difference between the results predicted by the species classifiers (p = 0.01, R² = 0.127). The relative abundances predicted by the classifiers, apart from MetaPhlAn2, were apparently biased by reference genome sizes. Additionally, we observed varying false-positive rates among the classifiers. MetaPhlAn2 had the lowest false-positive rate, whereas SLIMM had the greatest false-positive rate. Strain-level analysis results were also similar across platforms. Each platform correctly identified the strains present in the mock community, but accuracy was improved slightly with greater sequencing depth. Notably, PanPhlAn detected the dominant strains in each kefir sample above 500,000 reads per sample. Again, the outputs from functional profiling analysis using SUPER-FOCUS were generally accordant between the platforms at different sequencing depths. Finally, and as expected, metagenome assembly completeness was significantly lower on the MiSeq than on either the NextSeq (p = 0.03) or the Proton (p = 0.011), and it improved with increased sequencing depth. Conclusions: Our results demonstrate a remarkable similarity in the results generated by the three sequencing platforms at different sequencing depths, and, in fact, the choice of bioinformatics methodology had a more evident impact on results than the choice of sequencer did.
first_indexed	2024-12-19T20:10:34Z
format	Article
id	doaj.art-f149e2172aa94edab33cc07799c034ec
institution	Directory Open Access Journal
issn	2049-2618
language	English
last_indexed	2024-12-19T20:10:34Z
publishDate	2018-03-01
publisher	BMC
record_format	Article
series	Microbiome
spelling	doaj.art-f149e2172aa94edab33cc07799c034ec; 2022-12-21T20:07:19Z; eng; BMC; Microbiome; 2049-2618; 2018-03-01; 61115; 10.1186/s40168-018-0437-0; Species classifier choice is a key consideration when analysing low-complexity food microbiome data; Aaron M. Walsh (Teagasc Food Research Centre); Fiona Crispie (Teagasc Food Research Centre); Orla O’Sullivan (Teagasc Food Research Centre); Laura Finnegan (Teagasc Food Research Centre); Marcus J. Claesson (APC Microbiome Institute, University College Cork); Paul D. Cotter (Teagasc Food Research Centre); http://link.springer.com/article/10.1186/s40168-018-0437-0; Shotgun metagenomics; Sequencing platform comparison; Low-complexity microbiome
title	Species classifier choice is a key consideration when analysing low-complexity food microbiome data
topic	Shotgun metagenomics; Sequencing platform comparison; Low-complexity microbiome
url	http://link.springer.com/article/10.1186/s40168-018-0437-0

Similar Items