TransUFold: Unlocking the structural complexity of short and long RNA with pseudoknots

The RNA secondary structure is like a blueprint that holds the key to unlocking the mysteries of RNA function and 3D structure. It serves as a crucial foundation for investigating the complex world of RNA, making it an indispensable component of research in this exciting field. However, pseudoknots...

Bibliographic Details
Main Authors:	Yunxiang Wang, Hong Zhang, Zhenchao Xu, Shouhua Zhang, Rui Guo
Format:	Article
Language:	English
Published:	AIMS Press 2023-10-01
Series:	Mathematical Biosciences and Engineering
Subjects:	RNA secondary structure prediction; pseudoknot; vision transformer; long-range interactions; deep learning
Online Access:	https://www.aimspress.com/article/doi/10.3934/mbe.2023854?viewType=HTML

_version_	1797627809987821568
author	Yunxiang Wang; Hong Zhang; Zhenchao Xu; Shouhua Zhang; Rui Guo
author_facet	Yunxiang Wang; Hong Zhang; Zhenchao Xu; Shouhua Zhang; Rui Guo
author_sort	Yunxiang Wang
collection	DOAJ
description	The RNA secondary structure is like a blueprint that holds the key to unlocking the mysteries of RNA function and 3D structure. It serves as a crucial foundation for investigating the complex world of RNA, making it an indispensable component of research in this exciting field. However, pseudoknots cannot be accurately predicted by conventional prediction methods based on free energy minimization, which results in a performance bottleneck. To this end, we propose a deep learning-based method called TransUFold to train directly on RNA data annotated with structure information. It employs an encoder-decoder network architecture, named Vision Transformer, to extract long-range interactions in RNA sequences and utilizes convolutions with lateral connections to supplement short-range interactions. Then, a post-processing program is designed to constrain the model's output to produce realistic and effective RNA secondary structures, including pseudoknots. After training TransUFold on benchmark datasets, we outperform other methods in test data on the same family. Additionally, we achieve better results on longer sequences up to 1600 nt, demonstrating the outstanding performance of Vision Transformer in extracting long-range interactions in RNA sequences. Finally, our analysis indicates that TransUFold produces effective pseudoknot structures in long sequences. As more high-quality RNA structures become available, deep learning-based prediction methods like Vision Transformer can exhibit better performance.
first_indexed	2024-03-11T10:29:49Z
format	Article
id	doaj.art-89ecf8dd2ef24cfe80285e2db5143956
institution	Directory of Open Access Journals
issn	1551-0018
language	English
last_indexed	2024-03-11T10:29:49Z
publishDate	2023-10-01
publisher	AIMS Press
record_format	Article
series	Mathematical Biosciences and Engineering
spelling	doaj.art-89ecf8dd2ef24cfe80285e2db5143956 | 2023-11-15T01:13:11Z | eng | AIMS Press | Mathematical Biosciences and Engineering | ISSN 1551-0018 | 2023-10-01 | vol. 20, no. 11, pp. 19320-19340 | doi:10.3934/mbe.2023854 | TransUFold: Unlocking the structural complexity of short and long RNA with pseudoknots | Yunxiang Wang (School of Cyber Security and Computer, Hebei University, Baoding, Hebei, China) | Hong Zhang (School of Cyber Security and Computer, Hebei University, Baoding, Hebei, China) | Zhenchao Xu (School of Cyber Security and Computer, Hebei University, Baoding, Hebei, China) | Shouhua Zhang (Information Technology and Electrical Engineering, University of Oulu, Oulu, Finland) | Rui Guo (College of Life Sciences, Institute of Life Science and Green Development, Hebei University, Baoding, China)
title	TransUFold: Unlocking the structural complexity of short and long RNA with pseudoknots
topic	RNA secondary structure prediction; pseudoknot; vision transformer; long-range interactions; deep learning
url	https://www.aimspress.com/article/doi/10.3934/mbe.2023854?viewType=HTML

Similar Items