Using bacterial DNA sequencing data to investigate the epidemiology of plasmid-mediated antibiotic resistance

Bacterial plasmids are extra-chromosomal genetic elements, which can act as efficient vectors of antibiotic resistance. Epidemiological insight into plasmids may be gained by applying plasmid typing schemes, which exploit loci involved in replication and mobility functions (replicon and MOB typing,...

Bibliographic Details
Main Author: Orlek, A
Other Authors: Walker, A
Format: Thesis
Language: English
Published: 2020
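Subjects: Computational biology; Molecular epidemiology; Plasmids; Epidemiology--Statistical methods; Drug resistance in microorganisms; Microbial genomics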
collection OXFORD
description Bacterial plasmids are extra-chromosomal genetic elements that can act as efficient vectors of antibiotic resistance. Epidemiological insight into plasmids may be gained by applying plasmid typing schemes, which exploit loci involved in replication and mobility functions (replicon and MOB typing, respectively). In Chapter 2, I compiled a curated dataset of complete NCBI plasmids to assess the performance of in silico replicon and MOB typing in terms of concordance and ‘typeability’ (the proportion of plasmids typed). I found a degree of non-concordance between the schemes, which was attributed either to ambiguous boundaries between MOBP/MOBQ types or to the mosaic nature of some plasmid genomes. Ultimately, I showed that the schemes fail to accommodate the diversity of plasmid genomes: of ~14,000 curated bacterial plasmids, only 42% and 55% could be assigned a replicon and MOB type, respectively.
Given the limitations of plasmid typing, I subsequently focused on whole genome sequencing (WGS) analysis approaches that capitalise on the wider plasmid genome. High-throughput DNA sequencing has produced thousands of bacterial WGS datasets. However, such datasets commonly comprise short sequencing reads, which yield fragmented assemblies; this makes comparative analysis of plasmid genomes challenging. In Chapter 3, I developed two methods for comparative plasmid analysis, which cluster short-read sequenced samples according to (1) plasmid replicon types and (2) sample-vs-reference plasmid distance score profiles. However, benchmarking suggested that neither method is completely reliable.
The rise of long-read sequencing technology has increased the availability of complete plasmid assemblies, facilitating comparative plasmid genomic analyses. Nevertheless, available alignment-based comparative genomic tools have limitations: they often do not provide metrics on structural similarity, and they lack flexibility in terms of input/output options. Therefore, in Chapter 4, I developed a novel alignment-based tool (‘ATCG’) for calculating pairwise average nucleotide identity (ANI), coverage breadth, and structural similarity, while addressing limitations of existing alignment-based tools. Benchmarking demonstrated favourable runtimes and supported the validity of calculated ANI scores.
In Chapter 5, besides curating an updated plasmid dataset, I curated sample metadata (e.g. isolation source, geography). Using this metadata and plasmid biological features, I conducted multivariate statistical analyses to determine factors associated with plasmid resistance gene carriage, analysed across major resistance gene classes. The analysis showed, for example, that patterns of plasmid antibiotic resistance carriage in livestock and humans reflect known antibiotic usage.
first_indexed 2024-03-07T07:55:09Z
format Thesis
id oxford-uuid:dc85a536-9b29-4695-81b2-58a88cb947da
institution University of Oxford
language English
last_indexed 2024-03-07T07:55:09Z
publishDate 2020
record_format dspace
spelling oxford-uuid:dc85a536-9b29-4695-81b2-58a88cb947da; 2023-08-08T09:20:30Z; Using bacterial DNA sequencing data to investigate the epidemiology of plasmid-mediated antibiotic resistance; Thesis; http://purl.org/coar/resource_type/c_db06; uuid:dc85a536-9b29-4695-81b2-58a88cb947da; Computational biology; Molecular epidemiology; Plasmids; Epidemiology--Statistical methods; Drug resistance in microorganisms; Microbial genomics; English; Hyrax Deposit; 2020; Orlek, A; Walker, A; Sheppard, A; Anjum, M; Phan, H; Doumith, M; Peto, T
title Using bacterial DNA sequencing data to investigate the epidemiology of plasmid-mediated antibiotic resistance
topic Computational biology
Molecular epidemiology
Plasmids
Epidemiology--Statistical methods
Drug resistance in microorganisms
Microbial genomics