TETyper: a bioinformatic pipeline for classifying variation and genetic contexts of transposable elements from short-read whole-genome sequencing data

Bibliographic Details
Main Authors: Sheppard, A, Stoesser, N, German-Mesner, I, Vegesana, K, Walker, A, Crook, D, Mathers, A
Format: Journal article
Published: Microbiology Society 2018
collection OXFORD
description Much of the worldwide dissemination of antibiotic resistance has been driven by resistance gene associations with mobile genetic elements (MGEs), such as plasmids and transposons. Although increasing, our understanding of resistance spread remains relatively limited, as methods for tracking mobile resistance genes through multiple species, strains and plasmids are lacking. We have developed a bioinformatic pipeline for tracking variation within, and mobility of, specific transposable elements (TEs), such as transposons carrying antibiotic resistance genes. TETyper takes short-read whole-genome sequencing data as input and identifies single-nucleotide mutations and deletions within the TE of interest, to enable tracking of specific sequence variants, as well as the surrounding genetic context(s), to enable identification of transposition events. A major advantage of TETyper over previous methods is that it does not require a genome reference. To investigate global dissemination of Klebsiella pneumoniae carbapenemase (KPC) and its associated transposon Tn4401, we applied TETyper to a collection of >3000 publicly available Illumina datasets containing blaKPC. This revealed surprising diversity, with >200 distinct flanking genetic contexts for Tn4401, indicating high levels of transposition. Integration of sample metadata revealed insights into associations between geographic locations, host species, Tn4401 sequence variants and flanking genetic contexts. To demonstrate the ability of TETyper to cope with high copy number TEs and to track specific short-term evolutionary changes, we also applied it to the insertion sequence IS26 within a defined K. pneumoniae outbreak. TETyper is implemented in Python and is freely available at https://github.com/aesheppard/TETyper.
first_indexed 2024-03-07T04:03:46Z
id oxford-uuid:c575b6f8-8c6b-4306-90c4-bdec3c1e0b22
institution University of Oxford
last_indexed 2024-03-07T04:03:46Z
spelling oxford-uuid:c575b6f8-8c6b-4306-90c4-bdec3c1e0b22 2022-03-27T06:31:06Z http://purl.org/coar/resource_type/c_dcae04bc uuid:c575b6f8-8c6b-4306-90c4-bdec3c1e0b22 Symplectic Elements at Oxford