RLPredictiOme, a Machine Learning-Derived Method for High-Throughput Prediction of Plant Receptor-like Proteins, Reveals Novel Classes of Transmembrane Receptors
Cell surface receptors play essential roles in perceiving and processing external and internal signals at the cell surface of plants and animals. The receptor-like protein kinases (RLK) and receptor-like proteins (RLPs), two major classes of proteins with membrane receptor configuration, play a cruc...
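Main Authors: | Jose Cleydson F. Silva, Marco Aurélio Ferreira, Thales F. M. Carvalho, Fabyano F. Silva, Sabrina de A. Silveira, Sergio H. Brommonschenkel, Elizabeth P. B. Fontes |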
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-10-01 |
Series: | International Journal of Molecular Sciences |
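Subjects: | RLPredictiOme; probable lipid transfer (PLT)-RLP; plastocyanin-like-RLP; ring finger-RLP; glycosyl-hydrolase-RLP; glycerophosphoryldiester phosphodiesterase (GDPD, GDPDL)-RLP |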
Online Access: | https://www.mdpi.com/1422-0067/23/20/12176 |
_version_ | 1797407155765116928 |
---|---|
author | Jose Cleydson F. Silva; Marco Aurélio Ferreira; Thales F. M. Carvalho; Fabyano F. Silva; Sabrina de A. Silveira; Sergio H. Brommonschenkel; Elizabeth P. B. Fontes |
author_sort | Jose Cleydson F. Silva |
collection | DOAJ |
description | Cell surface receptors play essential roles in perceiving and processing external and internal signals at the cell surface of plants and animals. The receptor-like protein kinases (RLKs) and receptor-like proteins (RLPs), two major classes of proteins with membrane receptor configuration, play a crucial role in plant development and disease defense. Although RLPs and RLKs share a similar single-pass transmembrane configuration, RLPs harbor short divergent C-terminal regions instead of the conserved kinase domain of RLKs. This structural design of RLPs precludes sequence comparison algorithms from being used for high-throughput predictions of the RLP family in plant genomes, as has been extensively performed for RLK superfamily predictions. Here, we developed RLPredictiOme, implemented with machine learning (ML) models in combination with Bayesian inference, capable of predicting RLP subfamilies in plant genomes. The ML models were simultaneously trained using six types of features, in three stages: to distinguish RLPs from non-RLPs (NRLPs), to distinguish RLPs from RLKs, and to classify new subfamilies of RLPs in plants. The ML models achieved high accuracy, precision, sensitivity, and specificity for predicting RLPs, with prediction probabilities ranging from 0.79 to 0.99. The method's predictions were assessed with three datasets, two of which contained leucine-rich repeat (LRR)-RLPs from Arabidopsis and rice, while the third consisted of the complete set of previously described Arabidopsis RLPs. In these validation tests, more than 90% of known RLPs were correctly predicted by RLPredictiOme. In addition to predicting previously characterized RLPs, RLPredictiOme uncovered new RLP subfamilies in the Arabidopsis genome. These include the probable lipid transfer (PLT)-RLP, plastocyanin-like-RLP, ring finger-RLP, glycosyl-hydrolase-RLP, and glycerophosphoryldiester phosphodiesterase (GDPD, GDPDL)-RLP subfamilies, which are yet to be characterized. Molecular evolution analyses comparing them with the only Arabidopsis GDPDL-RLK indicated that the ectodomain of GDPDL-RLPs may have undergone purifying selection, with a predominance of synonymous substitutions. Expression analyses revealed that predicted GDPDL-RLPs display a basal expression level and respond to developmental and biotic signals. The results of these biological assays indicate that these subfamily members have maintained functional domains during evolution and may play relevant roles in development and plant defense. Therefore, RLPredictiOme provides a framework for genome-wide surveys of the RLP superfamily as a foundation to rationalize functional studies of surface receptors and their relationships with different biological processes. |
first_indexed | 2024-03-09T03:37:59Z |
format | Article |
id | doaj.art-f0108ab963864a08bdc0d485881626d4 |
institution | Directory of Open Access Journals |
issn | 1661-6596 1422-0067 |
language | English |
last_indexed | 2024-03-09T03:37:59Z |
publishDate | 2022-10-01 |
publisher | MDPI AG |
record_format | Article |
series | International Journal of Molecular Sciences |
spelling | doaj.art-f0108ab963864a08bdc0d485881626d4; 2023-12-03T14:46:44Z; eng; MDPI AG; International Journal of Molecular Sciences; ISSN 1661-6596, 1422-0067; 2022-10-01; vol. 23, issue 20, article 12176; DOI 10.3390/ijms232012176; Authors and affiliations: Jose Cleydson F. Silva (National Institute of Science and Technology in Plant-Pest Interactions, Bioagro, Viçosa 36570-900, Brazil); Marco Aurélio Ferreira (Department of Biochemistry and Molecular Biology, Universidade Federal de Viçosa, Viçosa 36570-900, Brazil); Thales F. M. Carvalho (Institute of Engineering, Science and Technology, Universidade Federal dos Vales do Jequitinhonha e Mucuri, Janaúba 39447-814, Brazil); Fabyano F. Silva (Department of Animal Science, Universidade Federal de Viçosa, Viçosa 36570-900, Brazil); Sabrina de A. Silveira (Department of Computer Science, Universidade Federal de Viçosa, Viçosa 36570-900, Brazil); Sergio H. Brommonschenkel (Plant Pathology Department/Bioagro, Universidade Federal de Viçosa, Viçosa 36570-900, Brazil); Elizabeth P. B. Fontes (Department of Biochemistry and Molecular Biology, Universidade Federal de Viçosa, Viçosa 36570-900, Brazil) |
title | RLPredictiOme, a Machine Learning-Derived Method for High-Throughput Prediction of Plant Receptor-like Proteins, Reveals Novel Classes of Transmembrane Receptors |
topic | RLPredictiOme; probable lipid transfer (PLT)-RLP; plastocyanin-like-RLP; ring finger-RLP; glycosyl-hydrolase-RLP; glycerophosphoryldiester phosphodiesterase (GDPD, GDPDL)-RLP |
url | https://www.mdpi.com/1422-0067/23/20/12176 |