A Simple and Robust Statistical Method to Define Genetic Relatedness of Samples Related to Outbreaks at the Genomic Scale – Application to Retrospective Salmonella Foodborne Outbreak Investigations
The investigation of foodborne outbreaks (FBOs) from genomic data typically relies on inspecting the relatedness of samples through a phylogenomic tree computed on either SNPs, genes, kmers, or alleles (i.e., cgMLST and wgMLST). The phylogenomic reconstruction is often time-consuming, computation-in...
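Main Authors: | Nicolas Radomski, Sabrina Cadel-Six, Emeline Cherchame, Arnaud Felten, Pauline Barbet, Federica Palma, Ludovic Mallet, Simon Le Hello, François-Xavier Weill, Laurent Guillier, Michel-Yves Mistou |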
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2019-10-01 |
Series: | Frontiers in Microbiology |
Subjects: | outbreak investigation; Salmonella Typhimurium; monophasic S. Typhimurium (S. 1,4,[5],12:i:-) |
Online Access: | https://www.frontiersin.org/article/10.3389/fmicb.2019.02413/full |
_version_ | 1818307736223350784 |
---|---|
author | Nicolas Radomski; Sabrina Cadel-Six; Emeline Cherchame; Arnaud Felten; Pauline Barbet; Federica Palma; Ludovic Mallet; Simon Le Hello; François-Xavier Weill; Laurent Guillier; Michel-Yves Mistou |
author_sort | Nicolas Radomski |
collection | DOAJ |
description | The investigation of foodborne outbreaks (FBOs) from genomic data typically relies on inspecting the relatedness of samples through a phylogenomic tree computed on either SNPs, genes, kmers, or alleles (i.e., cgMLST and wgMLST). The phylogenomic reconstruction is often time-consuming and computation-intensive, and it depends on hidden assumptions, pipeline implementation, and parameterization. In the context of FBO investigations, robust links between isolates are required in a timely manner to trigger appropriate management actions. Here, we propose a non-parametric statistical method to either assert the relatedness of samples (i.e., outbreak cases) or reject it (i.e., non-outbreak cases). With typical computations running within minutes on a desktop computer, we benchmarked the ability of three non-parametric statistical tests (i.e., Wilcoxon rank-sum, Kolmogorov–Smirnov and Kruskal–Wallis), applied to six different genomic features (i.e., SNPs, SNPs excluding recombination events, genes, kmers, cgMLST alleles, and wgMLST alleles), to discriminate outbreak cases (i.e., positive control: C+) from non-outbreak cases (i.e., negative control: C−). We leveraged four well-characterized and retrospectively investigated FBOs of Salmonella Typhimurium and its monophasic variant S. 1,4,[5],12:i:- from France, setting positive and negative controls in all assays. We show that the approaches relying on pairwise SNP differences distinguished all four outbreaks considered, in contrast to the other tested genomic features (i.e., genes, kmers, cgMLST alleles, and wgMLST alleles). The freely available non-parametric method, written in R, has been designed to be independent of both the phylogenomic reconstruction and the detection methods of genomic features (i.e., SNPs, genes, kmers, or alleles), making it widely and easily usable by anyone working on genomic data from suspected samples. |
first_indexed | 2024-12-13T07:03:07Z |
format | Article |
id | doaj.art-8c515303ade445c3bd1ba6bc6da65976 |
institution | Directory Open Access Journal |
issn | 1664-302X |
language | English |
last_indexed | 2024-12-13T07:03:07Z |
publishDate | 2019-10-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Microbiology |
spelling | doaj.art-8c515303ade445c3bd1ba6bc6da65976; 2022-12-21T23:55:52Z; eng; Frontiers Media S.A.; Frontiers in Microbiology; 1664-302X; 2019-10-01; 10; 10.3389/fmicb.2019.02413; 483573; Nicolas Radomski (ANSES, Laboratory for Food Safety, Université PARIS-EST, Maisons-Alfort, France); Sabrina Cadel-Six (ANSES, Laboratory for Food Safety, Université PARIS-EST, Maisons-Alfort, France); Emeline Cherchame (ANSES, Laboratory for Food Safety, Université PARIS-EST, Maisons-Alfort, France); Arnaud Felten (ANSES, Laboratory for Food Safety, Université PARIS-EST, Maisons-Alfort, France); Pauline Barbet (ANSES, Laboratory for Food Safety, Université PARIS-EST, Maisons-Alfort, France); Federica Palma (ANSES, Laboratory for Food Safety, Université PARIS-EST, Maisons-Alfort, France); Ludovic Mallet (ANSES, Laboratory for Food Safety, Université PARIS-EST, Maisons-Alfort, France); Simon Le Hello (Unité des Bactéries Pathogènes Entériques, Institut Pasteur, Centre National de Référence des Salmonella, Paris, France); François-Xavier Weill (Unité des Bactéries Pathogènes Entériques, Institut Pasteur, Centre National de Référence des Salmonella, Paris, France); Laurent Guillier (ANSES, Laboratory for Food Safety, Université PARIS-EST, Maisons-Alfort, France); Michel-Yves Mistou (ANSES, Laboratory for Food Safety, Université PARIS-EST, Maisons-Alfort, France) |
title | A Simple and Robust Statistical Method to Define Genetic Relatedness of Samples Related to Outbreaks at the Genomic Scale – Application to Retrospective Salmonella Foodborne Outbreak Investigations |
topic | outbreak investigation; Salmonella Typhimurium; monophasic S. Typhimurium (S. 1,4,[5],12:i:-) |
url | https://www.frontiersin.org/article/10.3389/fmicb.2019.02413/full |
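
The description above names the Wilcoxon rank-sum test applied to pairwise SNP differences but does not spell out the procedure. The following is a minimal Python sketch of one plausible reading, not the authors' R implementation: it compares the pairwise differences within a set of known outbreak isolates with the differences between a candidate isolate and that set. The toy SNP profiles, the function names, the comparison design, and the 0.05 threshold are all assumptions made for illustration. The p-value uses the normal approximation with a tie correction, which is rough for samples this small, and pairwise differences are not independent observations, so the result should be read as indicative only.

```python
import math
from itertools import combinations


def hamming(p, q):
    """Number of positions at which two equal-length profiles differ."""
    return sum(a != b for a, b in zip(p, q))


def pairwise_differences(profiles):
    """Differences between every unordered pair of profiles in one group."""
    return [hamming(p, q) for p, q in combinations(profiles, 2)]


def differences_to_group(candidate, profiles):
    """Differences between one candidate profile and each group member."""
    return [hamming(candidate, p) for p in profiles]


def midranks(values):
    """Ranks starting at 1 (ties share their average rank) and the sum of
    t**3 - t over tie groups, which is needed for the tie correction."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_term = 0
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        size = end - start + 1
        tie_term += size ** 3 - size
        start = end + 1
    return ranks, tie_term


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney U) test.

    Returns (U statistic of x, p-value) using the normal approximation
    with tie correction and no continuity correction.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    ranks, tie_term = midranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every value identical: no evidence of a difference
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(var_u)
    return u1, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical SNP profiles (one character per variable site).
outbreak = ["AAAAAAAAAA", "AAAAAAAAAC", "AAAAAAAACA", "AAAAAAACAA"]
within = pairwise_differences(outbreak)

u_far, p_far = rank_sum_test(within, differences_to_group("CCCCCCAAAA", outbreak))
u_near, p_near = rank_sum_test(within, differences_to_group("AAAAAACAAA", outbreak))

print("distant candidate:", u_far, p_far)
print("close candidate:", u_near, p_near)
```

In this toy data the distant candidate differs from the outbreak isolates far more than they differ from each other, so its p-value falls below 0.05, while the close candidate's does not. Real use would start from a pairwise difference matrix produced by a variant-calling or typing workflow rather than from hand-written strings.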