A Simple and Robust Statistical Method to Define Genetic Relatedness of Samples Related to Outbreaks at the Genomic Scale – Application to Retrospective Salmonella Foodborne Outbreak Investigations

The investigation of foodborne outbreaks (FBOs) from genomic data typically relies on inspecting the relatedness of samples through a phylogenomic tree computed on either SNPs, genes, kmers, or alleles (i.e., cgMLST and wgMLST). The phylogenomic reconstruction is often time-consuming, computation-in...


Bibliographic Details
Main Authors: Nicolas Radomski, Sabrina Cadel-Six, Emeline Cherchame, Arnaud Felten, Pauline Barbet, Federica Palma, Ludovic Mallet, Simon Le Hello, François-Xavier Weill, Laurent Guillier, Michel-Yves Mistou
Format: Article
Language: English
Published: Frontiers Media S.A. 2019-10-01
Series: Frontiers in Microbiology
Subjects: outbreak investigation; Salmonella Typhimurium; monophasic S. Typhimurium (S. 1,4,[5],12:i:-)
Online Access: https://www.frontiersin.org/article/10.3389/fmicb.2019.02413/full
collection DOAJ
description The investigation of foodborne outbreaks (FBOs) from genomic data typically relies on inspecting the relatedness of samples through a phylogenomic tree computed on either SNPs, genes, kmers, or alleles (i.e., cgMLST and wgMLST). The phylogenomic reconstruction is often time-consuming and computation-intensive, and it depends on hidden assumptions, pipeline implementation, and parameterization. In the context of FBO investigations, robust links between isolates are required in a timely manner to trigger appropriate management actions. Here, we propose a non-parametric statistical method to assert whether samples are related (i.e., outbreak cases) or to reject them as unrelated (i.e., non-outbreak cases). With typical computations running within minutes on a desktop computer, we benchmarked the ability of three non-parametric statistical tests (i.e., Wilcoxon rank-sum, Kolmogorov–Smirnov and Kruskal–Wallis) on six different genomic features (i.e., SNPs, SNPs excluding recombination events, genes, kmers, cgMLST alleles, and wgMLST alleles) to discriminate outbreak cases (i.e., positive control: C+) from non-outbreak cases (i.e., negative control: C−). We leveraged four well-characterized and retrospectively investigated FBOs of Salmonella Typhimurium and its monophasic variant S. 1,4,[5],12:i:- from France, setting positive and negative controls in all the assays. We show that the approaches relying on pairwise SNP differences distinguished all four considered outbreaks, in contrast to the other tested genomic features (i.e., genes, kmers, cgMLST alleles, and wgMLST alleles). The freely available non-parametric method, written in R, has been designed to be independent of both the phylogenomic reconstruction and the detection methods of genomic features (i.e., SNPs, genes, kmers, or alleles), making it widely and easily usable by anyone working on genomic data from suspected samples.
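The idea summarized in the description, testing whether pairwise SNP differences separate outbreak cases from non-outbreak cases, can be illustrated with a minimal sketch. This is not the authors' R implementation. The sample names, the distance values, and the choice to compare within-outbreak pairs against outbreak-to-background pairs are assumptions made purely for illustration. The rank-sum test below uses the normal approximation with a tie correction and no continuity correction.

```python
from itertools import combinations
from math import erfc, sqrt


def rank_with_ties(values):
    """Return 1-based average ranks and the sizes of the tie groups."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks, ties = rank_with_ties(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    tie_term = sum(t**3 - t for t in ties) / (n * (n - 1))
    sigma = sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    if sigma == 0:
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / sigma
    return u1, erfc(abs(z) / sqrt(2))


def split_pairwise(names, matrix, outbreak):
    """Split pairwise distances into within-outbreak and outbreak-vs-background."""
    within, between = [], []
    for i, j in combinations(range(len(names)), 2):
        a_in, b_in = names[i] in outbreak, names[j] in outbreak
        if a_in and b_in:
            within.append(matrix[i][j])
        elif a_in != b_in:
            between.append(matrix[i][j])
    return within, between


# Hypothetical pairwise SNP differences: A-D are suspected outbreak isolates,
# E-G are unrelated background isolates.
names = ["A", "B", "C", "D", "E", "F", "G"]
matrix = [
    [0, 1, 2, 1, 45, 52, 48],
    [1, 0, 3, 2, 46, 51, 49],
    [2, 3, 0, 1, 44, 53, 47],
    [1, 2, 1, 0, 45, 50, 48],
    [45, 46, 44, 45, 0, 30, 25],
    [52, 51, 53, 50, 30, 0, 28],
    [48, 49, 47, 48, 25, 28, 0],
]

within, between = split_pairwise(names, matrix, {"A", "B", "C", "D"})
u, p = rank_sum_test(within, between)
print(f"U = {u}, two-sided p = {p:.5f}")
```

A small p-value here only indicates that the two sets of pairwise differences come from different distributions in this toy data; the significance threshold and the control design used in practice would follow the article itself.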
first_indexed 2024-12-13T07:03:07Z
format Article
id doaj.art-8c515303ade445c3bd1ba6bc6da65976
institution Directory of Open Access Journals
issn 1664-302X
language English
last_indexed 2024-12-13T07:03:07Z
publishDate 2019-10-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Microbiology
spelling doaj.art-8c515303ade445c3bd1ba6bc6da65976; 2022-12-21T23:55:52Z; eng; Frontiers Media S.A.; Frontiers in Microbiology; ISSN 1664-302X; 2019-10-01; volume 10; DOI 10.3389/fmicb.2019.02413; identifier 483573
affiliation ANSES, Laboratory for Food Safety, Université PARIS-EST, Maisons-Alfort, France: Nicolas Radomski, Sabrina Cadel-Six, Emeline Cherchame, Arnaud Felten, Pauline Barbet, Federica Palma, Ludovic Mallet, Laurent Guillier, Michel-Yves Mistou
affiliation Unité des Bactéries Pathogènes Entériques, Institut Pasteur, Centre National de Référence des Salmonella, Paris, France: Simon Le Hello, François-Xavier Weill
topic outbreak investigation
Salmonella Typhimurium
monophasic S. Typhimurium (S. 1,4,[5],12:i:-)
url https://www.frontiersin.org/article/10.3389/fmicb.2019.02413/full