Comparison of in silico predicted Mycobacterium tuberculosis spoligotypes and lineages from whole genome sequencing data


Bibliographic Details
Main Authors: Gary Napier, David Couvin, Guislaine Refrégier, Christophe Guyeux, Conor J. Meehan, Christophe Sola, Susana Campino, Jody Phelan, Taane G. Clark
Format: Article
Language: English
Published: Nature Portfolio 2023-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-023-38384-3
Description
Summary: Bacterial strain-types in the Mycobacterium tuberculosis complex (MTBC) underlie tuberculosis disease, and have been associated with drug resistance, transmissibility, virulence, and host–pathogen interactions. Spoligotyping was developed as a molecular genotyping technique used to determine strain-types, though recent advances in whole genome sequencing (WGS) technology have led to their characterization using SNP-based sub-lineage nomenclature. Notwithstanding, spoligotyping remains an important tool and there is a need to study the congruence between spoligotyping-based and SNP-based sub-lineage assignation. To achieve this, an in silico spoligotype prediction method (“Spolpred2”) was developed and integrated into TB-Profiler. Lineage and spoligotype predictions were generated for >28k isolates and the overlap between strain-types was characterized. Major spoligotype families detected were Beijing (25.6%), T (18.6%), LAM (13.1%), CAS (9.4%), and EAI (8.3%), and these broadly followed known geographic distributions. Most spoligotypes were perfectly correlated with the main MTBC lineages (L1-L7, plus animal). Conversely, at lower levels of the sub-lineage system, the relationship breaks down, with only 65% of spoligotypes being perfectly associated with a sub-lineage at the second or subsequent levels of the hierarchy. Our work supports the use of spoligotyping (membrane or WGS-based) for low-resolution surveillance, and WGS or SNP-based systems for higher-resolution studies.
ISSN:2045-2322
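
The summary reports how often a spoligotype is "perfectly associated" with a lineage or sub-lineage. One way to read that measure is sketched below in Python. It is only an illustration and not the authors' method: it assumes a spoligotype counts as perfectly associated when every isolate carrying it falls in a single lineage at the chosen depth of the hierarchy. The function names, the dotted lineage labels, and the toy records are all hypothetical and are not taken from the paper or from TB-Profiler output.

```python
from collections import defaultdict


def truncate_lineage(lineage, level):
    """Cut a dotted label such as 'lineage4.3.1' down to its first `level` parts."""
    return ".".join(lineage.split(".")[:level])


def perfectly_associated_fraction(pairs, level=None):
    """Fraction of spoligotypes whose isolates all share one lineage label.

    pairs: iterable of (spoligotype, lineage) tuples, one per isolate.
    level: optional hierarchy depth at which to compare lineage labels.
    Assumption: 'perfect association' means exactly one distinct label.
    """
    lineages_by_spoligotype = defaultdict(set)
    for spoligotype, lineage in pairs:
        if level is not None:
            lineage = truncate_lineage(lineage, level)
        lineages_by_spoligotype[spoligotype].add(lineage)
    if not lineages_by_spoligotype:
        return 0.0
    perfect = sum(
        1 for labels in lineages_by_spoligotype.values() if len(labels) == 1
    )
    return perfect / len(lineages_by_spoligotype)


# Made-up toy records (spoligotype ID, lineage label); not real data.
toy_isolates = [
    ("S1", "lineage2.2.1"),
    ("S1", "lineage2.2.1"),
    ("S2", "lineage4.3"),
    ("S2", "lineage4.1"),
    ("S3", "lineage1.1"),
    ("S4", "lineage3"),
]

main_level_fraction = perfectly_associated_fraction(toy_isolates, level=1)
full_label_fraction = perfectly_associated_fraction(toy_isolates)
```

In this toy set, spoligotype S2 spans two sub-lineages of the same main lineage, so it counts as perfectly associated at the main-lineage level but not when full sub-lineage labels are compared. This mirrors, in miniature, the pattern the summary describes.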