Challenging a bioinformatic tool’s ability to detect microbial contaminants using in silico whole genome sequencing data
High sensitivity methods such as next generation sequencing and polymerase chain reaction (PCR) are adversely impacted by organismal and DNA contaminants. Current methods for detecting contaminants in microbial materials (genomic DNA and cultures) are not sensitive enough and require either a known...
Main Authors: | Nathan D. Olson, Justin M. Zook, Jayne B. Morrow, Nancy J. Lin |
---|---|
Format: | Article |
Language: | English |
Published: | PeerJ Inc., 2017-09-01 |
Series: | PeerJ |
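Subjects: | Genomic purity; Whole genome sequencing; Bioinformatics; Biodetection; Microbial materials; Reference materials |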
Online Access: | https://peerj.com/articles/3729.pdf |
_version_ | 1797420464701702144 |
---|---|
author | Nathan D. Olson; Justin M. Zook; Jayne B. Morrow; Nancy J. Lin |
collection | DOAJ |
description | High sensitivity methods such as next generation sequencing and polymerase chain reaction (PCR) are adversely impacted by organismal and DNA contaminants. Current methods for detecting contaminants in microbial materials (genomic DNA and cultures) are not sensitive enough and require either a known or culturable contaminant. Whole genome sequencing (WGS) is a promising approach for detecting contaminants due to its sensitivity and lack of need for a priori assumptions about the contaminant. Prior to applying WGS, we must first understand its limitations for detecting contaminants and potential for false positives. Herein we demonstrate and characterize a WGS-based approach to detect organismal contaminants using an existing metagenomic taxonomic classification algorithm. Simulated WGS datasets from ten genera as individuals and binary mixtures of eight organisms at varying ratios were analyzed to evaluate the role of contaminant concentration and taxonomy on detection. For the individual genomes the false positive contaminants reported depended on the genus, with Staphylococcus, Escherichia, and Shigella having the highest proportion of false positives. For nearly all binary mixtures the contaminant was detected in the in-silico datasets at the equivalent of 1 in 1,000 cells, though F. tularensis was not detected in any of the simulated contaminant mixtures and Y. pestis was only detected at the equivalent of one in 10 cells. Once a WGS method for detecting contaminants is characterized, it can be applied to evaluate microbial material purity, in efforts to ensure that contaminants are characterized in microbial materials used to validate pathogen detection assays, generate genome assemblies for database submission, and benchmark sequencing methods. |
first_indexed | 2024-03-09T07:02:51Z |
format | Article |
id | doaj.art-16eb5ed596b7475580f26c33a7328c53 |
institution | Directory Open Access Journal |
issn | 2167-8359 |
language | English |
last_indexed | 2024-03-09T07:02:51Z |
publishDate | 2017-09-01 |
publisher | PeerJ Inc. |
record_format | Article |
series | PeerJ |
spelling | doaj.art-16eb5ed596b7475580f26c33a7328c53; 2023-12-03T09:47:17Z; eng; PeerJ Inc.; PeerJ; ISSN 2167-8359; 2017-09-01; vol. 5, e3729; DOI 10.7717/peerj.3729; Challenging a bioinformatic tool’s ability to detect microbial contaminants using in silico whole genome sequencing data; Nathan D. Olson, Justin M. Zook, Jayne B. Morrow, Nancy J. Lin (all: Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, United States of America); https://peerj.com/articles/3729.pdf; Genomic purity; Whole genome sequencing; Bioinformatics; Biodetection; Microbial materials; Reference materials |
title | Challenging a bioinformatic tool’s ability to detect microbial contaminants using in silico whole genome sequencing data |
topic | Genomic purity; Whole genome sequencing; Bioinformatics; Biodetection; Microbial materials; Reference materials |
url | https://peerj.com/articles/3729.pdf |