ASRpro: A machine‐learning computational model for identifying proteins associated with multiple abiotic stress in plants
Abstract One of the thrust areas of research in plant breeding is to develop crop cultivars with enhanced tolerance to abiotic stresses. Thus, identifying abiotic stress‐responsive genes (SRGs) and proteins is important for plant breeding research. However, identifying such genes via established gen...
Main Authors: | Prabina Kumar Meher, Tanmaya Kumar Sahu, Ajit Gupta, Anuj Kumar, Sachin Rustgi |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley 2024-03-01 |
Series: | The Plant Genome |
Online Access: | https://doi.org/10.1002/tpg2.20259 |
_version_ | 1797253776595222528 |
---|---|
author | Prabina Kumar Meher, Tanmaya Kumar Sahu, Ajit Gupta, Anuj Kumar, Sachin Rustgi |
author_sort | Prabina Kumar Meher |
collection | DOAJ |
description | Abstract One of the thrust areas of research in plant breeding is to develop crop cultivars with enhanced tolerance to abiotic stresses. Thus, identifying abiotic stress‐responsive genes (SRGs) and proteins is important for plant breeding research. However, identifying such genes via established genetic approaches is laborious and resource intensive. Although transcriptome profiling has remained a reliable method of SRG identification, it is species specific. Additionally, identifying multistress responsive genes using gene expression studies is cumbersome. This underscores the need for a computational method to identify the genes associated with different abiotic stresses. In this work, we aimed to develop a computational model for identifying genes responsive to six abiotic stresses: cold, drought, heat, light, oxidative, and salt. The predictions were performed using support vector machine (SVM), random forest, adaptive boosting (ADB), and extreme gradient boosting (XGB), with auto‐cross covariance (ACC) and K‐mer compositional features as input. With ACC, K‐mer, and ACC + K‐mer compositional features, overall accuracies of ∼60–77%, ∼75–86%, and ∼61–78%, respectively, were obtained using the SVM algorithm with fivefold cross‐validation. The SVM also achieved higher accuracy than the other three algorithms. The proposed model was also assessed with an independent dataset and obtained an accuracy consistent with cross‐validation. The proposed model is the first of its kind and is expected to meet the needs of experimental biologists; however, the prediction accuracy was modest. Given its importance for the research community, the online prediction application, ASRpro, is made freely available (https://iasri-sg.icar.gov.in/asrpro/) for predicting abiotic SRGs and proteins. |
first_indexed | 2024-04-24T21:39:26Z |
format | Article |
id | doaj.art-71d4896f8de34ee99f05ce864ef20591 |
institution | Directory of Open Access Journals |
issn | 1940-3372 |
language | English |
last_indexed | 2024-04-24T21:39:26Z |
publishDate | 2024-03-01 |
publisher | Wiley |
record_format | Article |
series | The Plant Genome |
spelling | doaj.art-71d4896f8de34ee99f05ce864ef20591; 2024-03-21T11:34:18Z; eng; Wiley; The Plant Genome; 1940-3372; 2024-03-01; 17; 1; n/a; n/a; 10.1002/tpg2.20259; ASRpro: A machine‐learning computational model for identifying proteins associated with multiple abiotic stress in plants; Prabina Kumar Meher (ICAR‐Indian Agricultural Statistics Research Institute, New Delhi, India); Tanmaya Kumar Sahu (ICAR‐National Bureau of Plant Genetic Resources, New Delhi, India); Ajit Gupta (ICAR‐Indian Agricultural Statistics Research Institute, New Delhi, India); Anuj Kumar (Dep. of Microbiology and Immunology, Dalhousie Univ., Halifax, Nova Scotia, Canada); Sachin Rustgi (Dep. of Plant and Environmental Sciences, Pee Dee Research and Education Centre, Clemson Univ., Florence, SC, USA); https://doi.org/10.1002/tpg2.20259 |
title | ASRpro: A machine‐learning computational model for identifying proteins associated with multiple abiotic stress in plants |
title_sort | asrpro a machine learning computational model for identifying proteins associated with multiple abiotic stress in plants |
url | https://doi.org/10.1002/tpg2.20259 |
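The abstract names two sequence encodings, K‐mer composition and auto‐cross covariance (ACC), as inputs to the classifiers. The following is a minimal sketch of how such features might be computed for a protein sequence; it is not the authors' implementation. The function names, the choice of `k`, the `max_lag` value, and the two residue properties (the Kyte-Doolittle hydropathy scale and a crude ±1 side-chain charge) are assumptions made for illustration; the record does not state which properties or parameters ASRpro uses.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index (an assumed choice of property).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Crude side-chain charge: +1 for K/R, -1 for D/E, 0 otherwise (illustrative only).
CHARGE = {aa: 0.0 for aa in AMINO_ACIDS}
CHARGE.update({"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0})


def kmer_composition(seq, k=2):
    """Return normalized frequencies of all 20**k k-mers, in a fixed order."""
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with non-standard residues
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def acc_features(seq, properties, max_lag=2):
    """Return auto- and cross-covariances of residue properties for lags 1..max_lag."""
    residues = [aa for aa in seq.upper() if aa in AMINO_ACIDS]
    n = len(residues)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    # One numeric track per property, plus its mean over the sequence.
    tracks = [[prop[aa] for aa in residues] for prop in properties]
    means = [sum(t) / n for t in tracks]
    feats = []
    for lag in range(1, max_lag + 1):
        for a, track_a in enumerate(tracks):
            for b, track_b in enumerate(tracks):
                # a == b gives auto-covariance; a != b gives cross-covariance.
                s = sum(
                    (track_a[i] - means[a]) * (track_b[i + lag] - means[b])
                    for i in range(n - lag)
                )
                feats.append(s / (n - lag))
    return feats


if __name__ == "__main__":
    protein = "MKDELLAARGKV"  # made-up sequence for demonstration
    x = kmer_composition(protein, k=2) + acc_features(protein, [HYDROPATHY, CHARGE])
    print(len(x))
```

Under these assumptions, a dipeptide composition (`k=2`) contributes 400 values and ACC contributes `max_lag × P²` values for `P` properties; concatenating the two corresponds to the "ACC + K‐mer" setting described in the abstract. The resulting vectors could then be passed to any SVM implementation with fivefold cross‐validation.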