Comparison of in silico predicted Mycobacterium tuberculosis spoligotypes and lineages from whole genome sequencing data

Abstract Bacterial strain-types in the Mycobacterium tuberculosis complex underlie tuberculosis disease, and have been associated with drug resistance, transmissibility, virulence, and host–pathogen interactions. Spoligotyping was developed as a molecular genotyping technique used to determine strai...

Bibliographic Details
Main Authors: Gary Napier, David Couvin, Guislaine Refrégier, Christophe Guyeux, Conor J. Meehan, Christophe Sola, Susana Campino, Jody Phelan, Taane G. Clark
Format: Article
Language: English
Published: Nature Portfolio 2023-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-023-38384-3
author Gary Napier
David Couvin
Guislaine Refrégier
Christophe Guyeux
Conor J. Meehan
Christophe Sola
Susana Campino
Jody Phelan
Taane G. Clark
collection DOAJ
description Abstract: Bacterial strain-types in the Mycobacterium tuberculosis complex underlie tuberculosis disease, and have been associated with drug resistance, transmissibility, virulence, and host–pathogen interactions. Spoligotyping was developed as a molecular genotyping technique used to determine strain-types, though recent advances in whole genome sequencing (WGS) technology have led to their characterization using SNP-based sub-lineage nomenclature. Notwithstanding, spoligotyping remains an important tool and there is a need to study the congruence between spoligotyping-based and SNP-based sub-lineage assignation. To achieve this, an in silico spoligotype prediction method (“Spolpred2”) was developed and integrated into TB-Profiler. Lineage and spoligotype predictions were generated for >28k isolates and the overlap between strain-types was characterized. Major spoligotype families detected were Beijing (25.6%), T (18.6%), LAM (13.1%), CAS (9.4%), and EAI (8.3%), and these broadly followed known geographic distributions. Most spoligotypes were perfectly correlated with the main MTBC lineages (L1–L7, plus animal). Conversely, at lower levels of the sub-lineage system, the relationship breaks down, with only 65% of spoligotypes being perfectly associated with a sub-lineage at the second or subsequent levels of the hierarchy. Our work supports the use of spoligotyping (membrane or WGS-based) for low-resolution surveillance, and WGS or SNP-based systems for higher-resolution studies.
first_indexed 2024-03-12T23:24:47Z
format Article
id doaj.art-fcafc4a52ea941b0a4dbdda8c79a94e7
institution Directory Open Access Journal
issn 2045-2322
language English
last_indexed 2024-03-12T23:24:47Z
publishDate 2023-07-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj.art-fcafc4a52ea941b0a4dbdda8c79a94e7 2023-07-16T11:16:37Z eng Nature Portfolio Scientific Reports 2045-2322 2023-07-01 13117 10.1038/s41598-023-38384-3
Gary Napier (Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine)
David Couvin (Institut Pasteur de la Guadeloupe)
Guislaine Refrégier (Université Paris-Saclay)
Christophe Guyeux (DISC Computer Science Department, FEMTO-ST Institute, UMR 6174 CNRS, Univ. Bourgogne Franche-Comté (UBFC))
Conor J. Meehan (Nottingham Trent University)
Christophe Sola (Université Paris-Saclay)
Susana Campino (Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine)
Jody Phelan (Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine)
Taane G. Clark (Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine)
title Comparison of in silico predicted Mycobacterium tuberculosis spoligotypes and lineages from whole genome sequencing data
url https://doi.org/10.1038/s41598-023-38384-3