AGRAMP: machine learning models for predicting antimicrobial peptides against phytopathogenic bacteria
Introduction: Antimicrobial peptides (AMPs) are promising alternatives to traditional antibiotics for combating plant pathogenic bacteria in agriculture and the environment. However, identifying potent AMPs through laborious experimental assays is resource-intensive and time-consuming. To address thes...
Main Authors: | Jonathan Shao, Yan Zhao, Wei Wei, Iosif I. Vaisman |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A. 2024-03-01 |
Series: | Frontiers in Microbiology |
Subjects: | antimicrobial peptide; AGRAMP; Spiroplasma; N-gram; random forest; AMP |
Online Access: | https://www.frontiersin.org/articles/10.3389/fmicb.2024.1304044/full |
_version_ | 1797271391712575488 |
---|---|
author | Jonathan Shao; Yan Zhao; Wei Wei; Iosif I. Vaisman |
author_facet | Jonathan Shao; Yan Zhao; Wei Wei; Iosif I. Vaisman |
author_sort | Jonathan Shao |
collection | DOAJ |
description | Introduction: Antimicrobial peptides (AMPs) are promising alternatives to traditional antibiotics for combating plant pathogenic bacteria in agriculture and the environment. However, identifying potent AMPs through laborious experimental assays is resource-intensive and time-consuming. To address these limitations, this study presents a bioinformatics approach utilizing machine learning models for predicting and selecting AMPs active against plant pathogenic bacteria. Methods: N-gram representations of peptide sequences with 3-letter and 9-letter reduced amino acid alphabets were used to capture the sequence patterns and motifs that contribute to the antimicrobial activity of AMPs. A 5-fold cross-validation technique was used to train the machine learning models and to evaluate their predictive accuracy and robustness. Results: The models were applied to predict putative AMPs encoded by intergenic regions and small open reading frames (ORFs) of the citrus genome. Approximately 7% of the 10,000-peptide dataset from the intergenic region and 7% of the 685,924-peptide dataset from the whole genome were predicted as probable AMPs. The prediction accuracy of the reported models ranges from 0.72 to 0.91. A subset of the predicted AMPs was selected for experimental testing against Spiroplasma citri, the causative agent of citrus stubborn disease. The experimental results confirm the antimicrobial activity of the selected AMPs against the target bacterium, demonstrating the predictive capability of the machine learning models. Discussion: Hydrophobic amino acid residues and positively charged amino acid residues are among the key features in predicting AMPs by the random forest algorithm. Aggregation propensity appears to be correlated with the effectiveness of the AMPs. The described models would contribute to the development of effective AMP-based strategies for plant disease management in agricultural and environmental settings. To facilitate broader accessibility, our model is publicly available on the AGRAMP (Agricultural Ngrams Antimicrobial Peptides) server. |
first_indexed | 2024-03-07T14:03:01Z |
format | Article |
id | doaj.art-66415ee5ab084ebe9bd505d226b6b622 |
institution | Directory Open Access Journal |
issn | 1664-302X |
language | English |
last_indexed | 2024-03-07T14:03:01Z |
publishDate | 2024-03-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Microbiology |
spelling | doaj.art-66415ee5ab084ebe9bd505d226b6b622; 2024-03-07T05:02:43Z; eng; Frontiers Media S.A.; Frontiers in Microbiology; 1664-302X; 2024-03-01; vol. 15; 10.3389/fmicb.2024.1304044; 1304044; AGRAMP: machine learning models for predicting antimicrobial peptides against phytopathogenic bacteria; Jonathan Shao (Statistics and Bioinformatics Group - Northeast Area, U.S. Department of Agriculture, Agricultural Research Service, Beltsville, MD, United States; School of Systems Biology, George Mason University, Manassas, VA, United States); Yan Zhao (Molecular Plant Pathology Laboratory, U.S. Department of Agriculture, Agricultural Research Service, Beltsville, MD, United States); Wei Wei (Molecular Plant Pathology Laboratory, U.S. Department of Agriculture, Agricultural Research Service, Beltsville, MD, United States); Iosif I. Vaisman (School of Systems Biology, George Mason University, Manassas, VA, United States); https://www.frontiersin.org/articles/10.3389/fmicb.2024.1304044/full; antimicrobial peptide; AGRAMP; Spiroplasma; N-gram; random forest; AMP |
title | AGRAMP: machine learning models for predicting antimicrobial peptides against phytopathogenic bacteria |
topic | antimicrobial peptide; AGRAMP; Spiroplasma; N-gram; random forest; AMP |
url | https://www.frontiersin.org/articles/10.3389/fmicb.2024.1304044/full |
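
Note on the Methods summary: the description says peptide sequences were encoded as N-grams over reduced amino acid alphabets. The record does not give the residue groupings, so the sketch below uses a hypothetical 3-letter alphabet (h = hydrophobic, p = positively charged, o = other) chosen only for illustration; the names `REDUCED_3`, `reduce_sequence`, and `ngram_features` are likewise assumptions and not part of the AGRAMP server. It shows one plausible way to turn a peptide into a fixed-length N-gram frequency vector of the kind a random forest could consume.

```python
from collections import Counter
from itertools import product

# Hypothetical 3-letter reduced alphabet (an assumption, not the paper's grouping):
# h = hydrophobic, p = positively charged, o = all other standard residues.
REDUCED_3 = {}
for residue in "AVLIMFWC":
    REDUCED_3[residue] = "h"
for residue in "KRH":
    REDUCED_3[residue] = "p"
for residue in "DENQSTGPY":
    REDUCED_3[residue] = "o"


def reduce_sequence(peptide, mapping=REDUCED_3):
    """Translate a peptide into the reduced alphabet.

    Residues missing from the mapping raise KeyError.
    """
    return "".join(mapping[residue] for residue in peptide.upper())


def ngram_features(peptide, n=3, mapping=REDUCED_3):
    """Return relative n-gram frequencies over the reduced alphabet.

    The vector has one entry per possible n-gram, in sorted order,
    so every peptide maps to the same number of features.
    """
    reduced = reduce_sequence(peptide, mapping)
    counts = Counter(reduced[i:i + n] for i in range(len(reduced) - n + 1))
    total = sum(counts.values())
    letters = sorted(set(mapping.values()))
    vocabulary = ["".join(gram) for gram in product(letters, repeat=n)]
    # Peptides shorter than n yield an all-zero vector.
    return [counts[gram] / total if total else 0.0 for gram in vocabulary]
```

With a 3-letter alphabet and n = 3 the vector has 27 entries; a 9-letter alphabet would give 729 for the same n, which is one reason reduced alphabets keep N-gram feature spaces manageable compared with the full 20-letter alphabet.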