ASRpro: A machine‐learning computational model for identifying proteins associated with multiple abiotic stress in plants

Abstract One of the thrust areas of research in plant breeding is to develop crop cultivars with enhanced tolerance to abiotic stresses. Thus, identifying abiotic stress‐responsive genes (SRGs) and proteins is important for plant breeding research. However, identifying such genes via established gen...

Bibliographic Details
Main Authors: Prabina Kumar Meher, Tanmaya Kumar Sahu, Ajit Gupta, Anuj Kumar, Sachin Rustgi
Format: Article
Language: English
Published: Wiley 2024-03-01
Series: The Plant Genome
Online Access: https://doi.org/10.1002/tpg2.20259
Collection: DOAJ
Description: Abstract One of the thrust areas of research in plant breeding is developing crop cultivars with enhanced tolerance to abiotic stresses. Thus, identifying abiotic stress‐responsive genes (SRGs) and proteins is important for plant breeding research. However, identifying such genes via established genetic approaches is laborious and resource intensive. Although transcriptome profiling has remained a reliable method of SRG identification, it is species specific. Additionally, identifying multistress responsive genes using gene expression studies is cumbersome. This underscores the need for a computational method for identifying the genes associated with different abiotic stresses. In this work, we aimed to develop a computational model for identifying genes responsive to six abiotic stresses: cold, drought, heat, light, oxidative, and salt. The predictions were performed using support vector machine (SVM), random forest, adaptive boosting (ADB), and extreme gradient boosting (XGB), with autocross covariance (ACC) and K‐mer compositional features as input. With ACC, K‐mer, and ACC + K‐mer compositional features, overall accuracies of ∼60–77%, ∼75–86%, and ∼61–78%, respectively, were obtained using the SVM algorithm with fivefold cross‐validation. The SVM also achieved higher accuracy than the other three algorithms. The proposed model was also assessed with an independent dataset and obtained an accuracy consistent with cross‐validation. The proposed model is the first of its kind and is expected to serve the needs of experimental biologists; however, the prediction accuracy was modest. Given its importance for the research community, the online prediction application, ASRpro, is made freely available (https://iasri‐sg.icar.gov.in/asrpro/) for predicting abiotic SRGs and proteins.
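The abstract names K‐mer compositional features as one of the model inputs but does not define them. As a rough illustration only, the sketch below computes one common form of K‐mer composition for a protein sequence: the relative frequency of every length‐k substring over the 20 standard amino acids. The function name `kmer_composition`, the default `k=2`, the handling of non‐standard residues, and the example sequence are all assumptions made for this sketch and are not taken from the paper.

```python
from itertools import product

# The 20 standard amino-acid residues (assumed alphabet for this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_composition(sequence, k=2, alphabet=AMINO_ACIDS):
    """Return the normalized k-mer frequency vector of a protein sequence.

    The vector has len(alphabet) ** k entries, ordered as produced by
    itertools.product. Windows containing a residue outside the alphabet
    are skipped (an assumption of this sketch).
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    seq = sequence.upper()
    total = 0
    # Slide a window of width k across the sequence and count valid k-mers.
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:
            counts[j] += 1
            total += 1
    if total == 0:
        # No valid window (sequence too short or all windows skipped).
        return [0.0] * len(kmers)
    return [c / total for c in counts]


# Hypothetical example sequence, used only to show the call.
features = kmer_composition("MKTAYIAKQR", k=2)
```

A vector like `features` (400 values for k=2) could then be passed, one row per protein, to a classifier such as an SVM; the classifier settings used by the authors are not given in this record.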
ISSN: 1940-3372
Volume: 17
Issue: 1
Author Affiliations:
Prabina Kumar Meher: ICAR‐Indian Agricultural Statistics Research Institute, New Delhi, India
Tanmaya Kumar Sahu: ICAR‐National Bureau of Plant Genetic Resources, New Delhi, India
Ajit Gupta: ICAR‐Indian Agricultural Statistics Research Institute, New Delhi, India
Anuj Kumar: Dep. of Microbiology and Immunology, Dalhousie Univ., Halifax, Nova Scotia, Canada
Sachin Rustgi: Dep. of Plant and Environmental Sciences, Pee Dee Research and Education Centre, Clemson Univ., Florence, SC, USA