Challenging a bioinformatic tool’s ability to detect microbial contaminants using in silico whole genome sequencing data

High sensitivity methods such as next generation sequencing and polymerase chain reaction (PCR) are adversely impacted by organismal and DNA contaminants. Current methods for detecting contaminants in microbial materials (genomic DNA and cultures) are not sensitive enough and require either a known...


Bibliographic Details
Main Authors: Nathan D. Olson, Justin M. Zook, Jayne B. Morrow, Nancy J. Lin
Format: Article
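Language: English
Published: PeerJ Inc. 2017-09-01
Series: PeerJ
Subjects: Genomic purity; Whole genome sequencing; Bioinformatics; Biodetection; Microbial materials; Reference materials
Online Access: https://peerj.com/articles/3729.pdf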
collection DOAJ
description High sensitivity methods such as next generation sequencing and polymerase chain reaction (PCR) are adversely impacted by organismal and DNA contaminants. Current methods for detecting contaminants in microbial materials (genomic DNA and cultures) are not sensitive enough and require either a known or culturable contaminant. Whole genome sequencing (WGS) is a promising approach for detecting contaminants due to its sensitivity and lack of need for a priori assumptions about the contaminant. Prior to applying WGS, we must first understand its limitations for detecting contaminants and potential for false positives. Herein we demonstrate and characterize a WGS-based approach to detect organismal contaminants using an existing metagenomic taxonomic classification algorithm. Simulated WGS datasets from ten genera as individuals and binary mixtures of eight organisms at varying ratios were analyzed to evaluate the role of contaminant concentration and taxonomy on detection. For the individual genomes the false positive contaminants reported depended on the genus, with Staphylococcus, Escherichia, and Shigella having the highest proportion of false positives. For nearly all binary mixtures the contaminant was detected in the in-silico datasets at the equivalent of 1 in 1,000 cells, though F. tularensis was not detected in any of the simulated contaminant mixtures and Y. pestis was only detected at the equivalent of one in 10 cells. Once a WGS method for detecting contaminants is characterized, it can be applied to evaluate microbial material purity, in efforts to ensure that contaminants are characterized in microbial materials used to validate pathogen detection assays, generate genome assemblies for database submission, and benchmark sequencing methods.
first_indexed 2024-03-09T07:02:51Z
id doaj.art-16eb5ed596b7475580f26c33a7328c53
institution Directory of Open Access Journals
issn 2167-8359
last_indexed 2024-03-09T07:02:51Z
spelling doaj.art-16eb5ed596b7475580f26c33a7328c53; 2023-12-03T09:47:17Z; eng; PeerJ Inc.; PeerJ; ISSN 2167-8359; 2017-09-01; vol. 5; e3729; DOI 10.7717/peerj.3729
Nathan D. Olson, Justin M. Zook, Jayne B. Morrow, Nancy J. Lin (all: Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, United States of America)
topic Genomic purity
Whole genome sequencing
Bioinformatics
Biodetection
Microbial materials
Reference materials