TransUFold: Unlocking the structural complexity of short and long RNA with pseudoknots
The RNA secondary structure is like a blueprint that holds the key to unlocking the mysteries of RNA function and 3D structure. It serves as a crucial foundation for investigating the complex world of RNA, making it an indispensable component of research in this exciting field. However, pseudoknots...
Main Authors: | Yunxiang Wang, Hong Zhang, Zhenchao Xu, Shouhua Zhang, Rui Guo |
---|---|
Format: | Article |
Language: | English |
Published: | AIMS Press, 2023-10-01 |
Series: | Mathematical Biosciences and Engineering |
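Subjects: | RNA secondary structure prediction; pseudoknot; vision transformer; long-range interactions; deep learning |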
Online Access: | https://www.aimspress.com/article/doi/10.3934/mbe.2023854?viewType=HTML |
_version_ | 1797627809987821568 |
---|---|
author | Yunxiang Wang; Hong Zhang; Zhenchao Xu; Shouhua Zhang; Rui Guo |
author_sort | Yunxiang Wang |
collection | DOAJ |
description | The RNA secondary structure is like a blueprint that holds the key to unlocking the mysteries of RNA function and 3D structure. It serves as a crucial foundation for investigating the complex world of RNA, making it an indispensable component of research in this exciting field. However, pseudoknots cannot be accurately predicted by conventional prediction methods based on free energy minimization, which results in a performance bottleneck. To this end, we propose a deep learning-based method called TransUFold to train directly on RNA data annotated with structure information. It employs an encoder-decoder network architecture, named Vision Transformer, to extract long-range interactions in RNA sequences and utilizes convolutions with lateral connections to supplement short-range interactions. Then, a post-processing program is designed to constrain the model's output to produce realistic and effective RNA secondary structures, including pseudoknots. After training TransUFold on benchmark datasets, we outperform other methods in test data on the same family. Additionally, we achieve better results on longer sequences up to 1600 nt, demonstrating the outstanding performance of Vision Transformer in extracting long-range interactions in RNA sequences. Finally, our analysis indicates that TransUFold produces effective pseudoknot structures in long sequences. As more high-quality RNA structures become available, deep learning-based prediction methods like Vision Transformer can exhibit better performance. |
first_indexed | 2024-03-11T10:29:49Z |
format | Article |
id | doaj.art-89ecf8dd2ef24cfe80285e2db5143956 |
institution | Directory of Open Access Journals |
issn | 1551-0018 |
language | English |
last_indexed | 2024-03-11T10:29:49Z |
publishDate | 2023-10-01 |
publisher | AIMS Press |
record_format | Article |
series | Mathematical Biosciences and Engineering |
spelling | doaj.art-89ecf8dd2ef24cfe80285e2db5143956; 2023-11-15T01:13:11Z; eng; AIMS Press; Mathematical Biosciences and Engineering; ISSN 1551-0018; 2023-10-01; vol. 20, no. 11, pp. 19320-19340; DOI 10.3934/mbe.2023854; TransUFold: Unlocking the structural complexity of short and long RNA with pseudoknots; Authors: Yunxiang Wang (0), Hong Zhang (1), Zhenchao Xu (2), Shouhua Zhang (3), Rui Guo (4); Affiliations: 1. School of Cyber Security and Computer, Hebei University, Baoding, Hebei, China (authors 0-2); 2. Information Technology and Electrical Engineering, University of Oulu, Oulu, Finland (author 3); 3. College of Life Sciences, Institute of Life Science and Green Development, Hebei University, Baoding, China (author 4); https://www.aimspress.com/article/doi/10.3934/mbe.2023854?viewType=HTML; Keywords: rna secondary structure prediction; pseudoknot; vision transformer; long-range interactions; deep learning |
title | TransUFold: Unlocking the structural complexity of short and long RNA with pseudoknots |
topic | rna secondary structure prediction; pseudoknot; vision transformer; long-range interactions; deep learning |
url | https://www.aimspress.com/article/doi/10.3934/mbe.2023854?viewType=HTML |
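
The description in this record mentions a post-processing step that constrains the model's output to realistic secondary structures, including pseudoknots, but it does not say how. The sketch below is only an illustration of what such constraints could look like, not the authors' program. It assumes the network emits an L x L matrix of pairing scores, and it assumes three common validity rules: only canonical or wobble pairs (A-U, G-C, G-U), at least three unpaired bases inside any hairpin, and at most one partner per base. The names `constrain_pairs`, `has_pseudoknot`, `MIN_LOOP`, and the threshold of 0.5 are hypothetical, and the greedy selection is a simple stand-in for whatever optimization the paper actually uses.

```python
# Hypothetical illustration only: NOT the TransUFold post-processing code.
from typing import List, Tuple

# Assumed rule: allow Watson-Crick and G-U wobble pairs only.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
# Assumed rule: at least 3 unpaired bases between two paired positions.
MIN_LOOP = 3


def constrain_pairs(
    seq: str, scores: List[List[float]], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Greedily select base pairs (i, j), i < j, from a pairing-score matrix."""
    n = len(seq)
    candidates = []
    for i in range(n):
        for j in range(i + MIN_LOOP + 1, n):
            # Symmetrize, since pairing (i, j) is the same event as (j, i).
            s = 0.5 * (scores[i][j] + scores[j][i])
            if s >= threshold and (seq[i], seq[j]) in CANONICAL:
                candidates.append((s, i, j))
    # Highest score first; ties broken by position for determinism.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    used = set()
    pairs = []
    for _, i, j in candidates:
        # Each base may take part in at most one pair.
        if i not in used and j not in used:
            used.update((i, j))
            pairs.append((i, j))
    return sorted(pairs)


def has_pseudoknot(pairs: List[Tuple[int, int]]) -> bool:
    """True if any two pairs cross, i.e. i < k < j < l."""
    return any(i < k < j < l for i, j in pairs for k, l in pairs)
```

Because the selection never checks for nesting, crossing pairs can survive, which is the property a pseudoknot-aware post-processor needs; `has_pseudoknot` simply reports whether a selected set contains such a crossing.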