The selection of software and database for metagenomics sequence analysis impacts the outcome of microbial profiling and pathogen detection.
Main Authors: | Ruijie Xu, Sreekumari Rajeev, Liliana C M Salvador |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2023-01-01 |
Series: | PLoS ONE |
Online Access: | https://doi.org/10.1371/journal.pone.0284031 |
_version_ | 1797843200515244032 |
---|---|
author | Ruijie Xu, Sreekumari Rajeev, Liliana C M Salvador |
author_facet | Ruijie Xu, Sreekumari Rajeev, Liliana C M Salvador |
author_sort | Ruijie Xu |
collection | DOAJ |
description | Shotgun metagenomic sequencing analysis is widely used for microbial profiling of biological specimens and for pathogen detection. However, very little is known about the technical biases that the choice of analysis software and databases introduces into the results obtained from a biological specimen. In this study, we evaluated different direct-read shotgun metagenomics taxonomic profiling software to characterize the microbial compositions of simulated mouse gut microbiome samples and of biological samples collected from wild rodents across multiple taxonomic levels. Using ten of the most widely used metagenomics software tools and four different databases, we demonstrated that obtaining an accurate species-level microbial profile with current direct-read metagenomics profiling software is still a challenging task. We also showed that discrepancies in results between different databases and software could lead to significant variations in the distinct microbial taxa classified, in the characterization of the microbial communities, and in the differentially abundant taxa identified. Differences in database contents and read profiling algorithms are the main contributors to these discrepancies. The inclusion of host genomes and of genomes of the taxa of interest in the databases is important for increasing the accuracy of profiling. Our analysis also showed that the software tools included in this study differed in their ability to detect the presence of Leptospira, a major zoonotic pathogen of One Health importance, especially at species-level resolution. We concluded that using different database and software combinations can result in confounding biological conclusions in microbial profiling. Our study indicates that software and database selection must be based on the purpose of the study. |
first_indexed | 2024-04-09T17:02:20Z |
format | Article |
id | doaj.art-bf1023c4c19d446a91e00b4cef2df2de |
institution | Directory of Open Access Journals |
issn | 1932-6203 |
language | English |
last_indexed | 2024-04-09T17:02:20Z |
publishDate | 2023-01-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS ONE |
spelling | doaj.art-bf1023c4c19d446a91e00b4cef2df2de; 2023-04-21T05:32:18Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2023-01-01; 18(4): e0284031; 10.1371/journal.pone.0284031; The selection of software and database for metagenomics sequence analysis impacts the outcome of microbial profiling and pathogen detection.; Ruijie Xu; Sreekumari Rajeev; Liliana C M Salvador; https://doi.org/10.1371/journal.pone.0284031 |
spellingShingle | Ruijie Xu Sreekumari Rajeev Liliana C M Salvador The selection of software and database for metagenomics sequence analysis impacts the outcome of microbial profiling and pathogen detection. PLoS ONE |
title | The selection of software and database for metagenomics sequence analysis impacts the outcome of microbial profiling and pathogen detection. |
title_sort | selection of software and database for metagenomics sequence analysis impacts the outcome of microbial profiling and pathogen detection |
url | https://doi.org/10.1371/journal.pone.0284031 |