Detailed Evaluation of Data Analysis Tools for Subtyping of Bacterial Isolates Based on Whole Genome Sequencing: Neisseria meningitidis as a Proof of Concept

Whole genome sequencing is increasingly recognized as the most informative approach for characterization of bacterial isolates. Success of the routine use of this technology in public health laboratories depends on the availability of well-characterized and verified data analysis methods. However, m...

Bibliographic Details
Main Authors: Assia Saltykova, Wesley Mattheus, Sophie Bertrand, Nancy H. C. Roosens, Kathleen Marchal, Sigrid C. J. De Keersmaecker
Format: Article
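Language: English
Published: Frontiers Media S.A. 2019-12-01
Series: Frontiers in Microbiology
Subjects: Neisseria meningitidis; whole genome sequencing; public health; subtyping; data analysis; benchmarking
Online Access: https://www.frontiersin.org/article/10.3389/fmicb.2019.02897/full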
author Assia Saltykova
Wesley Mattheus
Sophie Bertrand
Nancy H. C. Roosens
Kathleen Marchal
Sigrid C. J. De Keersmaecker
author_sort Assia Saltykova
collection DOAJ
description Whole genome sequencing (WGS) is increasingly recognized as the most informative approach for characterization of bacterial isolates. Successful routine use of this technology in public health laboratories depends on the availability of well-characterized and verified data analysis methods. However, multiple subtyping workflows are now often used for a single organism, and the differences between them are not always well described. Moreover, methodologies for comparing subtyping workflows and assessing their performance are only beginning to emerge. The current work focuses on the detailed comparison of WGS-based subtyping workflows and on the evaluation of their suitability for the organism and the research context in question. We evaluated the performance of pipelines used for subtyping of Neisseria meningitidis, including the currently widely applied cgMLST approach and different SNP-based methods. In addition, we tested the impact of using different tools for detection and filtering of recombinant regions and of using different reference genomes. Our benchmarking analysis included both an assessment of the technical performance of the pipelines and a functional comparison of the generated genetic distance matrices and phylogenetic trees. It was carried out using high- and low-coverage replicate sequencing datasets, consisting mainly of isolates belonging to clonal complex 269. We demonstrated that cgMLST and some of the SNP-based subtyping workflows showed very good performance characteristics and produced highly similar genetic distance matrices and phylogenetic trees for isolates belonging to the same clonal complex. However, only two of the tested workflows demonstrated reproducible results for a group of more closely related isolates. Additionally, the results of the SNP-based subtyping workflows depended to some extent on the reference genome used. Interestingly, the use of recombination-filtering software generally reduced the similarity between the gene-by-gene and SNP-based methodologies for subtyping of N. meningitidis. Our study, in which N. meningitidis was taken as an example, clearly highlights the need for more comparative benchmarking studies that can eventually contribute to a justified use of a specific WGS data analysis workflow within an international public health laboratory context.
first_indexed 2024-12-12T17:33:26Z
format Article
id doaj.art-f61e58c0827f496aa2fa8f76a9a0a453
institution Directory Open Access Journal
issn 1664-302X
language English
last_indexed 2024-12-12T17:33:26Z
publishDate 2019-12-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Microbiology
spelling doaj.art-f61e58c0827f496aa2fa8f76a9a0a453 2022-12-22T00:17:19Z eng Frontiers Media S.A. Frontiers in Microbiology 1664-302X 2019-12-01 10 10.3389/fmicb.2019.02897 483133
Assia Saltykova (Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium; IDLab, IMEC, Department of Information Technology, Ghent University, Ghent, Belgium)
Wesley Mattheus (Belgian National Reference Centre for Neisseria, Human Bacterial Diseases, Sciensano, Brussels, Belgium)
Sophie Bertrand (Belgian National Reference Centre for Neisseria, Human Bacterial Diseases, Sciensano, Brussels, Belgium)
Nancy H. C. Roosens (Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium)
Kathleen Marchal (IDLab, IMEC, Department of Information Technology, Ghent University, Ghent, Belgium; Department of Plant Biotechnology and Bioinformatics, VIB, Ghent University, Ghent, Belgium)
Sigrid C. J. De Keersmaecker (Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium)
title Detailed Evaluation of Data Analysis Tools for Subtyping of Bacterial Isolates Based on Whole Genome Sequencing: Neisseria meningitidis as a Proof of Concept
topic Neisseria meningitidis
whole genome sequencing
public health
subtyping
data analysis
benchmarking
url https://www.frontiersin.org/article/10.3389/fmicb.2019.02897/full