The selection of software and database for metagenomics sequence analysis impacts the outcome of microbial profiling and pathogen detection.

Shotgun metagenomic sequencing analysis is widely used for microbial profiling of biological specimens and pathogen detection. However, very little is known about the technical biases caused by the choice of analysis software and databases on the biological specimen. In this study, we evaluated diff...

Bibliographic Details
Main Authors: Ruijie Xu, Sreekumari Rajeev, Liliana C M Salvador
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2023-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0284031
collection DOAJ
description Shotgun metagenomic sequencing analysis is widely used for microbial profiling of biological specimens and pathogen detection. However, very little is known about the technical biases caused by the choice of analysis software and databases on the biological specimen. In this study, we evaluated different direct read shotgun metagenomics taxonomic profiling software to characterize the microbial compositions of simulated mouse gut microbiome samples and of biological samples collected from wild rodents across multiple taxonomic levels. Using ten of the most widely used metagenomics software tools and four different databases, we demonstrated that obtaining an accurate species-level microbial profile using the current direct read metagenomics profiling software is still a challenging task. We also showed that the discrepancies in results when different databases and software were used could lead to significant variations in the distinct microbial taxa classified, in the characterizations of the microbial communities, and in the differentially abundant taxa identified. Differences in database contents and read profiling algorithms are the main contributors to these discrepancies. The inclusion of host genomes and of genomes of the taxa of interest in the databases is important for increasing the accuracy of profiling. Our analysis also showed that the software tools included in this study differed in their ability to detect the presence of Leptospira, a major zoonotic pathogen of One Health importance, especially at species-level resolution. We concluded that using different databases and software combinations can result in confounding biological conclusions in microbial profiling. Our study shows that software and database selection must be based on the purpose of the study.
first_indexed 2024-04-09T17:02:20Z
id doaj.art-bf1023c4c19d446a91e00b4cef2df2de
institution Directory Open Access Journal
issn 1932-6203
last_indexed 2024-04-09T17:02:20Z
spelling doaj.art-bf1023c4c19d446a91e00b4cef2df2de | 2023-04-21T05:32:18Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2023-01-01 | 18 | 4 | e0284031 | 10.1371/journal.pone.0284031 | https://doi.org/10.1371/journal.pone.0284031