Using bacterial pan-genome-based feature selection approach to improve the prediction of minimum inhibitory concentration (MIC)

Background: Predicting the resistance profiles of antimicrobial resistance (AMR) pathogens is becoming more and more important in treating infectious diseases. Various attempts have been made to build machine learning models to classify resistant or susceptible pathogens based on either known antimi...

Bibliographic Details
Main Authors: Ming-Ren Yang, Shun-Feng Su, Yu-Wei Wu
Format: Article
Language: English
Published: Frontiers Media S.A. 2023-05-01
Series: Frontiers in Genetics
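Subjects: feature selection; pan-genome; antimicrobial resistance (AMR); minimum inhibitory concentration; MIC; Salmonella enterica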
Online Access: https://www.frontiersin.org/articles/10.3389/fgene.2023.1054032/full
author Ming-Ren Yang
Shun-Feng Su
Yu-Wei Wu
author_sort Ming-Ren Yang
collection DOAJ
description Background: Predicting the resistance profiles of antimicrobial resistance (AMR) pathogens is becoming increasingly important in treating infectious diseases. Various attempts have been made to build machine learning models that classify pathogens as resistant or susceptible based on either known antimicrobial resistance genes or the entire gene set. However, these phenotypic annotations are translated from the minimum inhibitory concentration (MIC), which is the lowest concentration of an antibiotic drug that inhibits a given pathogenic strain. Because the MIC breakpoints that classify a strain as resistant or susceptible to a specific antibiotic drug may be revised by governing institutes, we refrained from translating these MIC values into the categories “susceptible” or “resistant” and instead attempted to predict the MIC values directly using machine learning approaches. Results: By applying a machine learning feature selection approach to a Salmonella enterica pan-genome, in which the protein sequences were clustered to identify highly similar gene families, we showed that the selected features (genes) performed better than known AMR genes, and that models built on the selected genes achieved very accurate MIC prediction. Functional analysis revealed that about half of the selected genes were annotated as hypothetical proteins (i.e., with unknown functional roles), and that only a small portion of known AMR genes were among the selected genes, indicating that applying feature selection to the entire gene set has the potential to uncover novel genes that may be associated with, and may contribute to, pathogenic antimicrobial resistance. Conclusion: The pan-genome-based machine learning approach was capable of predicting MIC values with very high accuracy. The feature selection process may also identify novel AMR genes for inferring bacterial antimicrobial resistance phenotypes.
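The workflow summarized in the description (gene presence/absence across a pan-genome, feature selection, then MIC prediction rather than resistant/susceptible classification) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the article: the toy gene names and MIC values are hypothetical, ranking genes by absolute Pearson correlation with log2(MIC) stands in for whatever feature selection method the authors used, the 1-nearest-neighbour predictor stands in for their models, and the "within one two-fold dilution" check is a commonly used tolerance that this record does not confirm for the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def select_genes(matrix, log2_mic, k):
    """Rank gene columns by |correlation| with log2(MIC); keep the top k indices."""
    scores = []
    for j in range(len(matrix[0])):
        column = [row[j] for row in matrix]
        scores.append((abs(pearson(column, log2_mic)), j))
    scores.sort(key=lambda t: (-t[0], t[1]))
    return [j for _, j in scores[:k]]


def predict_log2_mic(train_rows, train_y, selected, query):
    """1-nearest-neighbour prediction (Hamming distance on the selected genes)."""
    def distance(row):
        return sum(row[j] != query[j] for j in selected)
    best = min(range(len(train_rows)), key=lambda i: distance(train_rows[i]))
    return train_y[best]


def within_one_dilution(pred_log2, true_log2):
    """True if the prediction is within one two-fold dilution of the truth."""
    return abs(pred_log2 - true_log2) <= 1


# Hypothetical toy data: rows are strains, columns are gene families (1 = present).
GENES = ["geneA", "geneB", "geneC", "geneD"]
PRESENCE = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
]
MIC = [16, 16, 1, 1, 32, 1]  # hypothetical MIC values in mg/L
LOG2_MIC = [math.log2(m) for m in MIC]

selected = select_genes(PRESENCE, LOG2_MIC, k=1)
prediction = predict_log2_mic(PRESENCE, LOG2_MIC, selected, [1, 0, 0, 1])
print([GENES[j] for j in selected], 2 ** prediction)
```

Working on log2(MIC) keeps each two-fold dilution step one unit apart, so the output stays meaningful even if the breakpoints used to label strains as resistant or susceptible are later revised.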
first_indexed 2024-03-13T08:04:18Z
format Article
id doaj.art-782d5dad1e954cee85c3f9b419a1e086
institution Directory Open Access Journal
issn 1664-8021
language English
last_indexed 2024-03-13T08:04:18Z
publishDate 2023-05-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Genetics
spelling doaj.art-782d5dad1e954cee85c3f9b419a1e086; 2023-06-01T09:08:56Z; eng; Frontiers Media S.A.; Frontiers in Genetics; ISSN 1664-8021; 2023-05-01; volume 14; DOI 10.3389/fgene.2023.1054032; article 1054032
Author affiliations: Ming-Ren Yang (0, 1); Shun-Feng Su (2); Yu-Wei Wu (3, 4, 5)
0 Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
1 Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
2 Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
3 Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
4 Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan
5 TMU Research Center for Digestive Medicine, Taipei Medical University, Taipei, Taiwan
title Using bacterial pan-genome-based feature selection approach to improve the prediction of minimum inhibitory concentration (MIC)
topic feature selection
pan-genome
antimicrobial resistance (AMR)
minimum inhibitory concentration
MIC
Salmonella enterica
url https://www.frontiersin.org/articles/10.3389/fgene.2023.1054032/full