TransUFold: Unlocking the structural complexity of short and long RNA with pseudoknots

The RNA secondary structure is like a blueprint that holds the key to unlocking the mysteries of RNA function and 3D structure. It serves as a crucial foundation for investigating the complex world of RNA, making it an indispensable component of research in this exciting field. However, pseudoknots...


Bibliographic Details
Main Authors: Yunxiang Wang, Hong Zhang, Zhenchao Xu, Shouhua Zhang, Rui Guo
Format: Article
Language: English
Published: AIMS Press 2023-10-01
Series: Mathematical Biosciences and Engineering
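Subjects: RNA secondary structure prediction; pseudoknot; vision transformer; long-range interactions; deep learning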
Online Access: https://www.aimspress.com/article/doi/10.3934/mbe.2023854?viewType=HTML
author Yunxiang Wang
Hong Zhang
Zhenchao Xu
Shouhua Zhang
Rui Guo
collection DOAJ
description The RNA secondary structure is like a blueprint that holds the key to unlocking the mysteries of RNA function and 3D structure. It serves as a crucial foundation for investigating the complex world of RNA, making it an indispensable component of research in this exciting field. However, pseudoknots cannot be accurately predicted by conventional prediction methods based on free energy minimization, which results in a performance bottleneck. To this end, we propose a deep learning-based method called TransUFold to train directly on RNA data annotated with structure information. It employs an encoder-decoder network architecture, named Vision Transformer, to extract long-range interactions in RNA sequences and utilizes convolutions with lateral connections to supplement short-range interactions. Then, a post-processing program is designed to constrain the model's output to produce realistic and effective RNA secondary structures, including pseudoknots. After training TransUFold on benchmark datasets, we outperform other methods in test data on the same family. Additionally, we achieve better results on longer sequences up to 1600 nt, demonstrating the outstanding performance of Vision Transformer in extracting long-range interactions in RNA sequences. Finally, our analysis indicates that TransUFold produces effective pseudoknot structures in long sequences. As more high-quality RNA structures become available, deep learning-based prediction methods like Vision Transformer can exhibit better performance.
first_indexed 2024-03-11T10:29:49Z
format Article
id doaj.art-89ecf8dd2ef24cfe80285e2db5143956
institution Directory Open Access Journal
issn 1551-0018
language English
last_indexed 2024-03-11T10:29:49Z
publishDate 2023-10-01
publisher AIMS Press
record_format Article
series Mathematical Biosciences and Engineering
spelling doaj.art-89ecf8dd2ef24cfe80285e2db5143956; 2023-11-15T01:13:11Z; eng; AIMS Press; Mathematical Biosciences and Engineering; ISSN 1551-0018; 2023-10-01; vol. 20, no. 11, pp. 19320-19340; doi:10.3934/mbe.2023854
Yunxiang Wang, Hong Zhang, Zhenchao Xu: School of Cyber Security and Computer, Hebei University, Baoding, Hebei, China
Shouhua Zhang: Information Technology and Electrical Engineering, University of Oulu, Oulu, Finland
Rui Guo: College of Life Sciences, Institute of Life Science and Green Development, Hebei University, Baoding, China
title TransUFold: Unlocking the structural complexity of short and long RNA with pseudoknots
topic RNA secondary structure prediction
pseudoknot
vision transformer
long-range interactions
deep learning
url https://www.aimspress.com/article/doi/10.3934/mbe.2023854?viewType=HTML
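
Note on pseudoknots: the description in this record says that pseudoknots are the structures that conventional free-energy-minimization methods fail to predict accurately. As a rough illustration of what that term means at the base-pair level, the Python sketch below flags crossing base pairs, i.e. pairs (i, j) and (k, l) with i < k < j < l. This is not code from TransUFold and does not reproduce its post-processing step; the function names `crossing_pairs` and `has_pseudoknot`, the 0-based index convention, and the two example pair lists are assumptions made only for this illustration.

```python
from itertools import combinations


def crossing_pairs(pairs):
    """Return every pair of base pairs (i, j), (k, l) with i < k < j < l.

    `pairs` is an iterable of 2-tuples of sequence positions (assumed 0-based).
    """
    # Normalise each base pair to (smaller index, larger index) and sort,
    # so that for any two pairs taken in order the first opens no later.
    norm = sorted((min(a, b), max(a, b)) for a, b in pairs)
    crossings = []
    for (i, j), (k, l) in combinations(norm, 2):
        # The second pair opens inside the first and closes outside it.
        if i < k < j < l:
            crossings.append(((i, j), (k, l)))
    return crossings


def has_pseudoknot(pairs):
    """True if at least two base pairs cross (hypothetical helper)."""
    return bool(crossing_pairs(pairs))


# Hypothetical example structures, not data from the article.
nested = [(0, 9), (1, 8), (2, 7)]    # a plain stem: fully nested pairs
knotted = [(0, 5), (1, 4), (3, 8)]   # (3, 8) crosses both (0, 5) and (1, 4)

print(has_pseudoknot(nested))    # False
print(has_pseudoknot(knotted))   # True
```

A structure whose pairs are all nested can be written in plain dot-bracket notation, whereas crossing pairs need a second bracket type, which is one common way pseudoknots are represented.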