Automated design of dynamic programming schemes for RNA folding with pseudoknots

Abstract Although RNA secondary structure prediction is a textbook application of dynamic programming (DP) and routine task in RNA structure analysis, it remains challenging whenever pseudoknots come into play. Since the prediction of pseudoknotted structures by minimizing (realistically modelled) e...

Full description

Bibliographic Details
Main Authors: Bertrand Marchand, Sebastian Will, Sarah J. Berkemer, Yann Ponty, Laurent Bulteau
Format: Article
Language:English
Published: BMC 2023-12-01
Series:Algorithms for Molecular Biology
Subjects:
Online Access:https://doi.org/10.1186/s13015-023-00229-z
_version_ 1827604390304808960
author Bertrand Marchand
Sebastian Will
Sarah J. Berkemer
Yann Ponty
Laurent Bulteau
author_facet Bertrand Marchand
Sebastian Will
Sarah J. Berkemer
Yann Ponty
Laurent Bulteau
author_sort Bertrand Marchand
collection DOAJ
description Abstract Although RNA secondary structure prediction is a textbook application of dynamic programming (DP) and routine task in RNA structure analysis, it remains challenging whenever pseudoknots come into play. Since the prediction of pseudoknotted structures by minimizing (realistically modelled) energy is NP-hard, specialized algorithms have been proposed for restricted conformation classes that capture the most frequently observed configurations. To achieve good performance, these methods rely on specific and carefully hand-crafted DP schemes. In contrast, we generalize and fully automatize the design of DP pseudoknot prediction algorithms. For this purpose, we formalize the problem of designing DP algorithms for an (infinite) class of conformations, modeled by (a finite number of) fatgraphs, and automatically build DP schemes minimizing their algorithmic complexity. We propose an algorithm for the problem, based on the tree-decomposition of a well-chosen representative structure, which we simplify and reinterpret as a DP scheme. The algorithm is fixed-parameter tractable for the treewidth tw of the fatgraph, and its output represents a $${\mathcal {O}}(n^{tw+1})$$ O ( n t w + 1 ) algorithm (and even possibly $${\mathcal {O}}(n^{tw})$$ O ( n tw ) in simple energy models) for predicting the MFE folding of an RNA of length n. We demonstrate, for the most common pseudoknot classes, that our automatically generated algorithms achieve the same complexities as reported in the literature for hand-crafted schemes. Our framework supports general energy models, partition function computations, recursive substructures and partial folding, and could pave the way for algebraic dynamic programming beyond the context-free case.
first_indexed 2024-03-09T05:58:22Z
format Article
id doaj.art-7951f3a7c5444984b8602ca59e3b8304
institution Directory Open Access Journal
issn 1748-7188
language English
last_indexed 2024-03-09T05:58:22Z
publishDate 2023-12-01
publisher BMC
record_format Article
series Algorithms for Molecular Biology
spelling doaj.art-7951f3a7c5444984b8602ca59e3b83042023-12-03T12:11:45ZengBMCAlgorithms for Molecular Biology1748-71882023-12-0118112210.1186/s13015-023-00229-zAutomated design of dynamic programming schemes for RNA folding with pseudoknotsBertrand Marchand0Sebastian Will1Sarah J. Berkemer2Yann Ponty3Laurent Bulteau4LIX (UMR 7161), Ecole Polytechnique, Institut Polytechnique de ParisLIX (UMR 7161), Ecole Polytechnique, Institut Polytechnique de ParisLIX (UMR 7161), Ecole Polytechnique, Institut Polytechnique de ParisLIX (UMR 7161), Ecole Polytechnique, Institut Polytechnique de ParisLIGM, CNRS, University Gustave EiffelAbstract Although RNA secondary structure prediction is a textbook application of dynamic programming (DP) and routine task in RNA structure analysis, it remains challenging whenever pseudoknots come into play. Since the prediction of pseudoknotted structures by minimizing (realistically modelled) energy is NP-hard, specialized algorithms have been proposed for restricted conformation classes that capture the most frequently observed configurations. To achieve good performance, these methods rely on specific and carefully hand-crafted DP schemes. In contrast, we generalize and fully automatize the design of DP pseudoknot prediction algorithms. For this purpose, we formalize the problem of designing DP algorithms for an (infinite) class of conformations, modeled by (a finite number of) fatgraphs, and automatically build DP schemes minimizing their algorithmic complexity. We propose an algorithm for the problem, based on the tree-decomposition of a well-chosen representative structure, which we simplify and reinterpret as a DP scheme. The algorithm is fixed-parameter tractable for the treewidth tw of the fatgraph, and its output represents a $${\mathcal {O}}(n^{tw+1})$$ O ( n t w + 1 ) algorithm (and even possibly $${\mathcal {O}}(n^{tw})$$ O ( n tw ) in simple energy models) for predicting the MFE folding of an RNA of length n. We demonstrate, for the most common pseudoknot classes, that our automatically generated algorithms achieve the same complexities as reported in the literature for hand-crafted schemes. Our framework supports general energy models, partition function computations, recursive substructures and partial folding, and could pave the way for algebraic dynamic programming beyond the context-free case.https://doi.org/10.1186/s13015-023-00229-zPseudoknotsRNA foldingTree DecompositionTreewidth
spellingShingle Bertrand Marchand
Sebastian Will
Sarah J. Berkemer
Yann Ponty
Laurent Bulteau
Automated design of dynamic programming schemes for RNA folding with pseudoknots
Algorithms for Molecular Biology
Pseudoknots
RNA folding
Tree Decomposition
Treewidth
title Automated design of dynamic programming schemes for RNA folding with pseudoknots
title_full Automated design of dynamic programming schemes for RNA folding with pseudoknots
title_fullStr Automated design of dynamic programming schemes for RNA folding with pseudoknots
title_full_unstemmed Automated design of dynamic programming schemes for RNA folding with pseudoknots
title_short Automated design of dynamic programming schemes for RNA folding with pseudoknots
title_sort automated design of dynamic programming schemes for rna folding with pseudoknots
topic Pseudoknots
RNA folding
Tree Decomposition
Treewidth
url https://doi.org/10.1186/s13015-023-00229-z
work_keys_str_mv AT bertrandmarchand automateddesignofdynamicprogrammingschemesforrnafoldingwithpseudoknots
AT sebastianwill automateddesignofdynamicprogrammingschemesforrnafoldingwithpseudoknots
AT sarahjberkemer automateddesignofdynamicprogrammingschemesforrnafoldingwithpseudoknots
AT yannponty automateddesignofdynamicprogrammingschemesforrnafoldingwithpseudoknots
AT laurentbulteau automateddesignofdynamicprogrammingschemesforrnafoldingwithpseudoknots