RLPredictiOme, a Machine Learning-Derived Method for High-Throughput Prediction of Plant Receptor-like Proteins, Reveals Novel Classes of Transmembrane Receptors

Cell surface receptors play essential roles in perceiving and processing external and internal signals at the cell surface of plants and animals. Receptor-like protein kinases (RLKs) and receptor-like proteins (RLPs), two major classes of proteins with a membrane receptor configuration, play a cruc...

Bibliographic Details
Main Authors: Jose Cleydson F. Silva, Marco Aurélio Ferreira, Thales F. M. Carvalho, Fabyano F. Silva, Sabrina de A. Silveira, Sergio H. Brommonschenkel, Elizabeth P. B. Fontes
Format: Article
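Language: English
Published: MDPI AG 2022-10-01
Series: International Journal of Molecular Sciences
Subjects: RLPredictiOme; probable lipid transfer (PLT)-RLP; plastocyanin-like-RLP; ring finger-RLP; glycosyl-hydrolase-RLP; glycerophosphoryldiester phosphodiesterase (GDPD, GDPDL)-RLP
Online Access: https://www.mdpi.com/1422-0067/23/20/12176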
author Jose Cleydson F. Silva
Marco Aurélio Ferreira
Thales F. M. Carvalho
Fabyano F. Silva
Sabrina de A. Silveira
Sergio H. Brommonschenkel
Elizabeth P. B. Fontes
collection DOAJ
description Cell surface receptors play essential roles in perceiving and processing external and internal signals at the cell surface of plants and animals. Receptor-like protein kinases (RLKs) and receptor-like proteins (RLPs), two major classes of proteins with a membrane receptor configuration, play crucial roles in plant development and disease defense. Although RLPs and RLKs share a similar single-pass transmembrane configuration, RLPs harbor short, divergent C-terminal regions instead of the conserved kinase domain of RLKs. This structural design prevents sequence comparison algorithms from being used for high-throughput prediction of the RLP family in plant genomes, as has been done extensively for the RLK superfamily. Here, we developed RLPredictiOme, a method that combines machine learning (ML) models with Bayesian inference to predict RLP subfamilies in plant genomes. The ML models were trained simultaneously on six types of features and organized into three stages that distinguish RLPs from non-RLPs (NRLPs), distinguish RLPs from RLKs, and classify new subfamilies of RLPs in plants. The ML models achieved high accuracy, precision, sensitivity, and specificity for predicting RLPs, with relatively high probabilities ranging from 0.79 to 0.99. The method was assessed with three datasets: two contained leucine-rich repeat (LRR)-RLPs from Arabidopsis and rice, and the third consisted of the complete set of previously described Arabidopsis RLPs. In these validation tests, RLPredictiOme correctly predicted more than 90% of known RLPs. In addition to predicting previously characterized RLPs, RLPredictiOme uncovered new RLP subfamilies in the Arabidopsis genome. These include the probable lipid transfer (PLT)-RLP, plastocyanin-like-RLP, ring finger-RLP, glycosyl-hydrolase-RLP, and glycerophosphoryldiester phosphodiesterase (GDPD, GDPDL)-RLP subfamilies, which are yet to be characterized.
In comparison with the only Arabidopsis GDPDL-RLK, molecular evolution studies indicated that the ectodomain of GDPDL-RLPs may have undergone purifying selection, with a predominance of synonymous substitutions. Expression analyses revealed that the predicted GDPDL-RLPs display a basal expression level and respond to developmental and biotic signals. The results of these biological assays indicate that these subfamily members have maintained functional domains during evolution and may play relevant roles in development and plant defense. Therefore, RLPredictiOme provides a framework for genome-wide surveys of the RLP superfamily as a foundation to rationalize functional studies of surface receptors and their relationships with different biological processes.
first_indexed 2024-03-09T03:37:59Z
format Article
id doaj.art-f0108ab963864a08bdc0d485881626d4
institution Directory of Open Access Journals
issn 1661-6596
1422-0067
language English
last_indexed 2024-03-09T03:37:59Z
publishDate 2022-10-01
publisher MDPI AG
record_format Article
series International Journal of Molecular Sciences
spelling doaj.art-f0108ab963864a08bdc0d485881626d4
2023-12-03T14:46:44Z
eng
MDPI AG
International Journal of Molecular Sciences
ISSN 1661-6596, 1422-0067
2022-10-01
Volume 23, issue 20, article 12176
DOI 10.3390/ijms232012176
RLPredictiOme, a Machine Learning-Derived Method for High-Throughput Prediction of Plant Receptor-like Proteins, Reveals Novel Classes of Transmembrane Receptors
Jose Cleydson F. Silva (National Institute of Science and Technology in Plant-Pest Interactions, Bioagro, Viçosa 36570-900, Brazil)
Marco Aurélio Ferreira (Department of Biochemistry and Molecular Biology, Universidade Federal de Viçosa, Viçosa 36570-900, Brazil)
Thales F. M. Carvalho (Institute of Engineering, Science and Technology, Universidade Federal dos Vales do Jequitinhonha e Mucuri, Janaúba 39447-814, Brazil)
Fabyano F. Silva (Department of Animal Science, Universidade Federal de Viçosa, Viçosa 36570-900, Brazil)
Sabrina de A. Silveira (Department of Computer Science, Universidade Federal de Viçosa, Viçosa 36570-900, Brazil)
Sergio H. Brommonschenkel (Plant Pathology Department/Bioagro, Universidade Federal de Viçosa, Viçosa 36570-900, Brazil)
Elizabeth P. B. Fontes (Department of Biochemistry and Molecular Biology, Universidade Federal de Viçosa, Viçosa 36570-900, Brazil)
https://www.mdpi.com/1422-0067/23/20/12176
title RLPredictiOme, a Machine Learning-Derived Method for High-Throughput Prediction of Plant Receptor-like Proteins, Reveals Novel Classes of Transmembrane Receptors
topic RLPredictiOme
probable lipid transfer (PLT)-RLP
plastocyanin-like-RLP
ring finger-RLP
glycosyl-hydrolase-RLP
glycerophosphoryldiester phosphodiesterase (GDPD, GDPDL)-RLP
url https://www.mdpi.com/1422-0067/23/20/12176