DMfold: A Novel Method to Predict RNA Secondary Structure With Pseudoknots Based on Deep Learning and Improved Base Pair Maximization Principle

Predicting the secondary structure of RNA is vital for studying its function, but determining that structure is challenging, especially when it contains pseudoknots. Several established computational methods can be used to predict the secondary structure (with or without pseu...


Bibliographic Details
Main Authors: Linyu Wang, Yuanning Liu, Xiaodan Zhong, Haiming Liu, Chao Lu, Cong Li, Hao Zhang
Format: Article
Language: English
Published: Frontiers Media S.A. 2019-03-01
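Series: Frontiers in Genetics
Subjects: RNA; secondary structure prediction; pseudoknot; deep learning; multi-sequence method; single-sequence method
Online Access: https://www.frontiersin.org/article/10.3389/fgene.2019.00143/full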
_version_ 1811274369303838720
author Linyu Wang
Yuanning Liu
Xiaodan Zhong
Haiming Liu
Chao Lu
Cong Li
Hao Zhang
author_facet Linyu Wang
Yuanning Liu
Xiaodan Zhong
Haiming Liu
Chao Lu
Cong Li
Hao Zhang
author_sort Linyu Wang
collection DOAJ
description Predicting the secondary structure of RNA is vital for studying its function, but determining that structure is challenging, especially when it contains pseudoknots. Several established computational methods can predict secondary structure (with or without pseudoknots), each with its own merits and drawbacks. These methods fall into two categories: multi-sequence methods and single-sequence methods. Multi-sequence methods use auxiliary sequences to assist the prediction, but they succeed only when multiple highly homologous sequences are available. Single-sequence methods are easy to operate, since they need only the target sequence, but their folding parameters capture features common to diverse RNAs and cannot describe the unique characteristics of a particular RNA, which can lead to low prediction accuracy for some RNAs. This paper proposes “DMfold,” a method based on deep learning and an improved base pair maximization principle, to predict secondary structure with pseudoknots; it combines the advantages of both categories while avoiding some of their disadvantages. Notably, DMfold predicts the secondary structure of an RNA by learning from similar RNAs with known structures, using similar RNA sequences in place of the highly homologous sequences that multi-sequence methods require, and thereby reduces the requirement for auxiliary sequences. DMfold needs only the target sequence as input. Its folding parameters are extracted automatically by deep learning, which avoids the shortage of folding parameters in single-sequence methods. Experiments show that our method is simple to operate and improves prediction accuracy compared with several established prediction methods.
A repository containing our code can be found at https://github.com/linyuwangPHD/RNA-Secondary-Structure-Database.
first_indexed 2024-04-12T23:17:45Z
format Article
id doaj.art-ef6efcfe687d412a9362765b1e96cf47
institution Directory Open Access Journal
issn 1664-8021
language English
last_indexed 2024-04-12T23:17:45Z
publishDate 2019-03-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Genetics
spelling doaj.art-ef6efcfe687d412a9362765b1e96cf47 2022-12-22T03:12:37Z eng Frontiers Media S.A. Frontiers in Genetics 1664-8021 2019-03-01 10 10.3389/fgene.2019.00143 440502
DMfold: A Novel Method to Predict RNA Secondary Structure With Pseudoknots Based on Deep Learning and Improved Base Pair Maximization Principle
Linyu Wang (0, 1); Yuanning Liu (2, 3); Xiaodan Zhong (4, 5, 6); Haiming Liu (7, 8); Chao Lu (9, 10); Cong Li (11, 12); Hao Zhang (13, 14)
0, 2, 4, 7, 9, 11, 13: College of Computer Science and Technology, Jilin University, Changchun, China
1, 3, 5, 8, 10, 12, 14: Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, Jilin University, Changchun, China
6: Department of Pediatric Oncology, The First Hospital of Jilin University, Changchun, China
https://www.frontiersin.org/article/10.3389/fgene.2019.00143/full
RNA; secondary structure prediction; pseudoknot; deep learning; multi-sequence method; single-sequence method
title DMfold: A Novel Method to Predict RNA Secondary Structure With Pseudoknots Based on Deep Learning and Improved Base Pair Maximization Principle
topic RNA
secondary structure prediction
pseudoknot
deep learning
multi-sequence method
single-sequence method
url https://www.frontiersin.org/article/10.3389/fgene.2019.00143/full