DeepLRR: An Online Webserver for Leucine-Rich-Repeat Containing Protein Characterization Based on Deep Learning

Members of the leucine-rich repeat (LRR) superfamily play critical roles in multiple biological processes. As the LRR unit sequence is highly variable, accurately predicting the number and location of LRR units in proteins is a highly challenging task in the field of bioinformatics. Existing methods...


Bibliographic Details
Main Authors: Zhenya Liu, Zirui Ren, Lunyi Yan, Feng Li
Format: Article
Language: English
Published: MDPI AG 2022-01-01
Series: Plants
Subjects: deep learning, LRR domain, plant disease-resistance genes
Online Access: https://www.mdpi.com/2223-7747/11/1/136
author Zhenya Liu
Zirui Ren
Lunyi Yan
Feng Li
collection DOAJ
description Members of the leucine-rich repeat (LRR) superfamily play critical roles in multiple biological processes. Because the LRR unit sequence is highly variable, accurately predicting the number and location of LRR units in proteins is a challenging task in bioinformatics. Existing methods, especially similarity-based ones, still need improvement. We introduce DeepLRR, a method based on a convolutional neural network (CNN) model and LRR features, to predict the number and location of LRR units in proteins. We compared DeepLRR with six existing methods on a dataset containing 572 LRR proteins, and it outperformed all of them in overall F1 score. In addition, DeepLRR integrates the identification of plant disease-resistance proteins (NLR, LRR-RLK, LRR-RLP) and non-canonical domains. With DeepLRR, 223, 191 and 183 LRR-RLK genes in the Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa ssp. japonica) and tomato (Solanum lycopersicum) genomes were re-annotated, respectively. Chromosome mapping and gene cluster analysis revealed that 24.2% (54/223), 29.8% (57/191) and 16.9% (31/183) of LRR-RLK genes formed gene cluster structures in Arabidopsis, rice and tomato, respectively. Finally, we explored the evolutionary relationships and domain composition of LRR-RLK genes in each plant and the distributions of known receptor and co-receptor pairs. This provides a new perspective for the identification of potential receptors and co-receptors.
first_indexed 2024-03-10T03:25:56Z
format Article
id doaj.art-511f0361f69d4d3a87d0217064ce1725
institution Directory Open Access Journal
issn 2223-7747
language English
last_indexed 2024-03-10T03:25:56Z
publishDate 2022-01-01
publisher MDPI AG
record_format Article
series Plants
spelling doaj.art-511f0361f69d4d3a87d0217064ce1725; 2023-11-23T12:08:13Z; eng; MDPI AG; Plants; 2223-7747; 2022-01-01; volume 11, issue 1, article 136; 10.3390/plants11010136
DeepLRR: An Online Webserver for Leucine-Rich-Repeat Containing Protein Characterization Based on Deep Learning
Zhenya Liu (0), Zirui Ren (1), Lunyi Yan (2), Feng Li (3)
0: Key Lab of Horticultural Plant Biology (MOE), College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan 430070, China
1: College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
2: College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
3: Key Lab of Horticultural Plant Biology (MOE), College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan 430070, China
https://www.mdpi.com/2223-7747/11/1/136
deep learning; LRR domain; plant disease-resistance genes
title DeepLRR: An Online Webserver for Leucine-Rich-Repeat Containing Protein Characterization Based on Deep Learning
topic deep learning
LRR domain
plant disease-resistance genes
url https://www.mdpi.com/2223-7747/11/1/136