Summary: | Vibrio parahaemolyticus is an important human foodborne pathogen transmitted through the consumption of contaminated seafood, and the number of reported infections has grown worldwide in recent years. A multilocus sequence typing (MLST) database for V. parahaemolyticus was created in 2008, and many clones have since been identified, some causing severe outbreaks worldwide (ST3), some causing recurrent outbreaks in certain regions (e.g., ST36), and some spreading to regions where they are non-endemic (e.g., ST88 or ST189). The current MLST scheme uses the sequences of 7 genes to generate a sequence type (ST). It is a powerful tool for inferring the population structure of this pathogen, but its resolution is limited, especially compared to pulsed-field gel electrophoresis (PFGE). Whole genome sequencing (WGS) has become routine for traceback investigations, and core genome MLST (cgMLST) analysis is one of the most straightforward ways to explore complex genomic data in an epidemiological context. There is therefore a need for a new, portable, standardized system that uses WGS data to provide higher resolution and discriminatory power among V. parahaemolyticus strains. We sequenced 92 V. parahaemolyticus genomes and used the genome of strain RIMD 2210633 (4832 genes in total) as a reference to determine which genes were suitable for a V. parahaemolyticus cgMLST scheme. This analysis identified 2254 suitable core genes. To evaluate the scheme, we performed a cgMLST analysis of the 92 newly sequenced genomes plus 142 additional strains with genomes available at NCBI. cgMLST analysis distinguished related from unrelated strains, including strains with the same ST, clearly showing its enhanced resolution over conventional MLST analysis. 
It also distinguished outbreak-related from unrelated strains within the same ST. The sequences obtained in this work were deposited in the public database (http://pubmlst.org/vparahaemolyticus). Applying this cgMLST scheme to V. parahaemolyticus strains provided by laboratories around the world will reveal the global epidemiology, spread, and evolution of this pathogen, and the scheme will become a powerful tool for outbreak investigations by allowing unambiguous comparison of strains with global coverage.
|