DeepLRR: An Online Webserver for Leucine-Rich-Repeat Containing Protein Characterization Based on Deep Learning
Members of the leucine-rich repeat (LRR) superfamily play critical roles in multiple biological processes. As the LRR unit sequence is highly variable, accurately predicting the number and location of LRR units in proteins is a highly challenging task in the field of bioinformatics. Existing methods...
Main Authors: | Zhenya Liu, Zirui Ren, Lunyi Yan, Feng Li |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-01-01 |
Series: | Plants |
Subjects: | deep learning, LRR domain, plant disease-resistance genes |
Online Access: | https://www.mdpi.com/2223-7747/11/1/136 |
_version_ | 1797497912871092224 |
---|---|
author | Zhenya Liu Zirui Ren Lunyi Yan Feng Li |
author_sort | Zhenya Liu |
collection | DOAJ |
description | Members of the leucine-rich repeat (LRR) superfamily play critical roles in multiple biological processes. Because the LRR unit sequence is highly variable, accurately predicting the number and location of LRR units in proteins is a challenging bioinformatics task. Existing methods, especially similarity-based ones, still need improvement. We introduce DeepLRR, a method based on a convolutional neural network (CNN) model and LRR features, to predict the number and location of LRR units in proteins. We compared DeepLRR with six existing methods on a dataset of 572 LRR proteins, and it outperformed all of them in overall F1 score. In addition, DeepLRR integrates identification of plant disease-resistance proteins (NLR, LRR-RLK, LRR-RLP) and non-canonical domains. With DeepLRR, 223, 191 and 183 LRR-RLK genes were re-annotated in the Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa ssp. japonica) and tomato (Solanum lycopersicum) genomes, respectively. Chromosome mapping and gene cluster analysis revealed that 24.2% (54/223), 29.8% (57/191) and 16.9% (31/183) of LRR-RLK genes formed gene clusters in Arabidopsis, rice and tomato, respectively. Finally, we explored the evolutionary relationships and domain composition of LRR-RLK genes in each plant, as well as the distribution of known receptor and co-receptor pairs, which provides a new perspective for identifying potential receptors and co-receptors. |
first_indexed | 2024-03-10T03:25:56Z |
format | Article |
id | doaj.art-511f0361f69d4d3a87d0217064ce1725 |
institution | Directory of Open Access Journals |
issn | 2223-7747 |
language | English |
last_indexed | 2024-03-10T03:25:56Z |
publishDate | 2022-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Plants |
spelling | doaj.art-511f0361f69d4d3a87d0217064ce1725; 2023-11-23T12:08:13Z; eng; MDPI AG; Plants; 2223-7747; 2022-01-01; 11; 1; 136; 10.3390/plants11010136; DeepLRR: An Online Webserver for Leucine-Rich-Repeat Containing Protein Characterization Based on Deep Learning; Zhenya Liu (0), Zirui Ren (1), Lunyi Yan (2), Feng Li (3); (0) Key Lab of Horticultural Plant Biology (MOE), College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan 430070, China; (1) College of Informatics, Huazhong Agricultural University, Wuhan 430070, China; (2) College of Informatics, Huazhong Agricultural University, Wuhan 430070, China; (3) Key Lab of Horticultural Plant Biology (MOE), College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan 430070, China; https://www.mdpi.com/2223-7747/11/1/136; deep learning; LRR domain; plant disease-resistance genes |
title | DeepLRR: An Online Webserver for Leucine-Rich-Repeat Containing Protein Characterization Based on Deep Learning |
topic | deep learning LRR domain plant disease-resistance genes |
url | https://www.mdpi.com/2223-7747/11/1/136 |