Selective sweep sites and SNP dense regions differentiate Mycobacterium bovis isolates across scales
Mycobacterium bovis, a bacterial zoonotic pathogen responsible for the economically and agriculturally important livestock disease bovine tuberculosis (bTB), infects a broad mammalian host range worldwide. This characteristic has led to bidirectional transmission events between livestock and wildlif...
Main Authors: | Noah Legall, Liliana C. M. Salvador |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2022-09-01 |
Series: | Frontiers in Microbiology |
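Subjects: | comparative genomics; Mycobacterium bovis; host range; geographic location; population clusters; selective sweeps |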
Online Access: | https://www.frontiersin.org/articles/10.3389/fmicb.2022.787856/full |
_version_ | 1798037483364024320 |
---|---|
author | Noah Legall; Liliana C. M. Salvador |
author_facet | Noah Legall; Liliana C. M. Salvador |
author_sort | Noah Legall |
collection | DOAJ |
description | Mycobacterium bovis, a bacterial zoonotic pathogen responsible for the economically and agriculturally important livestock disease bovine tuberculosis (bTB), infects a broad mammalian host range worldwide. This characteristic has led to bidirectional transmission events between livestock and wildlife species as well as the formation of wildlife reservoirs, impacting the success of bTB control measures. Next Generation Sequencing (NGS) has transformed our ability to understand disease transmission events by tracking variant sites; however, the genomic signatures related to host adaptation following spillover, and the role of other genomic factors in the M. bovis transmission process, remain understudied. We analyzed publicly available M. bovis datasets collected from 700 hosts across three countries with bTB endemic regions (United Kingdom, United States, and New Zealand) to investigate whether genomic regions with high SNP density and/or selective sweep sites play a role in M. bovis adaptation to new environments (e.g., at the host-species, geographical, and/or sub-population levels). A simulated M. bovis alignment was created to generate null distributions for defining genomic regions with high SNP counts and regions with evidence of selective sweeps. Random Forest (RF) models were used to investigate evolutionary metrics within the genomic regions of interest to determine which genomic processes best classified M. bovis across ecological scales. We identified 14 high SNP density regions and 132 selective sweep regions in the M. bovis genomes. Selective sweep regions were ranked as the most important in classifying M. bovis across the different scales in all RF models. SNP dense regions had high importance in the badger- and cattle-specific RF models for distinguishing badger-derived isolates from livestock-derived ones. Additionally, the genes detected within these genomic regions harbor various pathogenic functions such as virulence and immunogenicity, membrane structure, host survival, and mycobactin production. The results of this study demonstrate how comparative genomics alongside machine learning approaches can be used to further investigate the nature of M. bovis host-pathogen interactions. |
first_indexed | 2024-04-11T21:27:09Z |
format | Article |
id | doaj.art-d5b5e13310344c969686f1a674c39cff |
institution | Directory of Open Access Journals |
issn | 1664-302X |
language | English |
last_indexed | 2024-04-11T21:27:09Z |
publishDate | 2022-09-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Microbiology |
spelling | doaj.art-d5b5e13310344c969686f1a674c39cff; 2022-12-22T04:02:21Z; eng; Frontiers Media S.A.; Frontiers in Microbiology; 1664-302X; 2022-09-01; vol. 13; 10.3389/fmicb.2022.787856; 787856; Selective sweep sites and SNP dense regions differentiate Mycobacterium bovis isolates across scales; Noah Legall (Interdisciplinary Disease Ecology Across Scales Research Traineeship Program, University of Georgia, Athens, GA, United States; Institute of Bioinformatics, University of Georgia, Athens, GA, United States; Center for the Ecology of Infectious Diseases, University of Georgia, Athens, GA, United States); Liliana C. M. Salvador (Institute of Bioinformatics, University of Georgia, Athens, GA, United States; Center for the Ecology of Infectious Diseases, University of Georgia, Athens, GA, United States; Department of Infectious Diseases, College of Veterinary Medicine, University of Georgia, Athens, GA, United States); https://www.frontiersin.org/articles/10.3389/fmicb.2022.787856/full; comparative genomics; Mycobacterium bovis; host range; geographic location; population clusters; selective sweeps |
spellingShingle | Noah Legall; Liliana C. M. Salvador; Selective sweep sites and SNP dense regions differentiate Mycobacterium bovis isolates across scales; Frontiers in Microbiology; comparative genomics; Mycobacterium bovis; host range; geographic location; population clusters; selective sweeps |
title | Selective sweep sites and SNP dense regions differentiate Mycobacterium bovis isolates across scales |
title_full | Selective sweep sites and SNP dense regions differentiate Mycobacterium bovis isolates across scales |
title_fullStr | Selective sweep sites and SNP dense regions differentiate Mycobacterium bovis isolates across scales |
title_full_unstemmed | Selective sweep sites and SNP dense regions differentiate Mycobacterium bovis isolates across scales |
title_short | Selective sweep sites and SNP dense regions differentiate Mycobacterium bovis isolates across scales |
title_sort | selective sweep sites and snp dense regions differentiate mycobacterium bovis isolates across scales |
topic | comparative genomics; Mycobacterium bovis; host range; geographic location; population clusters; selective sweeps |
url | https://www.frontiersin.org/articles/10.3389/fmicb.2022.787856/full |
work_keys_str_mv | AT noahlegall selectivesweepsitesandsnpdenseregionsdifferentiatemycobacteriumbovisisolatesacrossscales AT lilianacmsalvador selectivesweepsitesandsnpdenseregionsdifferentiatemycobacteriumbovisisolatesacrossscales |