ASRpro: A machine‐learning computational model for identifying proteins associated with multiple abiotic stress in plants


Bibliographic Details
Main Authors: Prabina Kumar Meher, Tanmaya Kumar Sahu, Ajit Gupta, Anuj Kumar, Sachin Rustgi
Format: Article
Language: English
Published: Wiley 2024-03-01
Series: The Plant Genome
Online Access: https://doi.org/10.1002/tpg2.20259
Description
Summary: One of the thrust areas of research in plant breeding is to develop crop cultivars with enhanced tolerance to abiotic stresses. Thus, identifying abiotic stress‐responsive genes (SRGs) and proteins is important for plant breeding research. However, identifying such genes via established genetic approaches is laborious and resource intensive. Although transcriptome profiling has remained a reliable method of SRG identification, it is species specific. Additionally, identifying multistress responsive genes using gene expression studies is cumbersome. These limitations underscore the need for a computational method for identifying the genes associated with different abiotic stresses. In this work, we aimed to develop a computational model for identifying genes responsive to six abiotic stresses: cold, drought, heat, light, oxidative, and salt. The predictions were performed using support vector machine (SVM), random forest, adaptive boosting (ADB), and extreme gradient boosting (XGB), where the autocross covariance (ACC) and K‐mer compositional features were used as input. With ACC, K‐mer, and ACC + K‐mer compositional features, overall accuracies of ∼60–77%, ∼75–86%, and ∼61–78%, respectively, were obtained using the SVM algorithm with fivefold cross‐validation. The SVM also achieved higher accuracy than the other three algorithms. The proposed model was also assessed with an independent dataset and obtained an accuracy consistent with cross‐validation. The proposed model is the first of its kind and is expected to serve the requirement of experimental biologists; however, the prediction accuracy was modest. Given its importance for the research community, the online prediction application, ASRpro, is made freely available (https://iasri-sg.icar.gov.in/asrpro/) for predicting abiotic SRGs and proteins.
ISSN: 1940-3372
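
Illustrative note: the summary names K‐mer compositional features as one of the inputs to the classifiers. The sketch below shows one common way such features can be computed for a protein sequence, as the normalized frequency of every possible K‐mer over the 20 standard amino acids. The function name `kmer_composition`, the choice of K, the handling of non‐standard residues, and the example sequence are all assumptions made for illustration; the summary does not state the exact definition used in ASRpro.

```python
from itertools import product

# The 20 standard amino acid one-letter codes (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_composition(sequence, k=2):
    """Return normalized frequencies of all 20**k possible k-mers.

    Hypothetical helper: windows containing a non-standard residue
    (e.g. 'X') are skipped, which is an assumption of this sketch.
    """
    seq = sequence.upper()
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    # Divide by the number of counted windows so the vector sums to 1.
    return [counts[m] / total if total else 0.0 for m in kmers]


# Made-up example sequence, not taken from the ASRpro datasets.
example = "MKVLAAGIVK"
mono = kmer_composition(example, k=1)  # 20 features
di = kmer_composition(example, k=2)    # 400 features
```

A vector of this kind, computed per sequence, could then be passed to any of the classifiers named in the summary (for example an SVM evaluated with fivefold cross‐validation).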