Selective sweep sites and SNP dense regions differentiate Mycobacterium bovis isolates across scales

Bibliographic Details
Main Authors: Noah Legall, Liliana C. M. Salvador
Format: Article
Language: English
Published: Frontiers Media S.A. 2022-09-01
Series: Frontiers in Microbiology
Subjects:
Online Access: https://www.frontiersin.org/articles/10.3389/fmicb.2022.787856/full
collection DOAJ
description Mycobacterium bovis, a zoonotic bacterial pathogen responsible for bovine tuberculosis (bTB), an economically and agriculturally important livestock disease, infects a broad range of mammalian hosts worldwide. This broad host range has led to bidirectional transmission between livestock and wildlife species and to the formation of wildlife reservoirs, affecting the success of bTB control measures. Next Generation Sequencing (NGS) has transformed our ability to understand disease transmission events by tracking variant sites; however, the genomic signatures of host adaptation following spillover and the role of other genomic factors in M. bovis transmission remain understudied. We analyzed publicly available M. bovis datasets collected from 700 hosts across three countries with bTB-endemic regions (United Kingdom, United States, and New Zealand) to investigate whether genomic regions with high SNP density and/or selective sweep sites play a role in M. bovis adaptation to new environments (e.g., at the host-species, geographical, and/or sub-population levels). A simulated M. bovis alignment was created to generate null distributions for defining genomic regions with high SNP counts and regions with evidence of selective sweeps. Random Forest (RF) models were used to evaluate evolutionary metrics within the genomic regions of interest and to determine which genomic processes best classified M. bovis across ecological scales. We identified 14 high SNP density regions and 132 selective sweep regions in the M. bovis genomes. Selective sweep regions ranked as the most important for classifying M. bovis across the different scales in all RF models. SNP dense regions had high importance in the badger- and cattle-specific RF models for distinguishing badger-derived isolates from livestock-derived ones. Additionally, the genes detected within these genomic regions are associated with various pathogenic functions such as virulence and immunogenicity, membrane structure, host survival, and mycobactin production. The results of this study demonstrate how comparative genomics combined with machine learning can be used to further investigate the nature of M. bovis host-pathogen interactions.
first_indexed 2024-04-11T21:27:09Z
id doaj.art-d5b5e13310344c969686f1a674c39cff
institution Directory of Open Access Journals
issn 1664-302X
last_indexed 2024-04-11T21:27:09Z
spelling doaj.art-d5b5e13310344c969686f1a674c39cff, 2022-12-22T04:02:21Z, eng
Frontiers Media S.A., Frontiers in Microbiology, ISSN 1664-302X, 2022-09-01, volume 13, doi 10.3389/fmicb.2022.787856, article 787856
Noah Legall: Interdisciplinary Disease Ecology Across Scales Research Traineeship Program, University of Georgia, Athens, GA, United States; Institute of Bioinformatics, University of Georgia, Athens, GA, United States; Center for the Ecology of Infectious Diseases, University of Georgia, Athens, GA, United States
Liliana C. M. Salvador: Institute of Bioinformatics, University of Georgia, Athens, GA, United States; Center for the Ecology of Infectious Diseases, University of Georgia, Athens, GA, United States; Department of Infectious Diseases, College of Veterinary Medicine, University of Georgia, Athens, GA, United States
topic comparative genomics
Mycobacterium bovis
host range
geographic location
population clusters
selective sweeps