In this paper, we introduce a combinatorial framework that interprets RNA pseudoknot structures as sampling paths of a Markov process. Our results facilitate a variety of applications, ranging from the energy-based sampling of pseudoknot structures to ab initio folding via hidden Markov models. Our main result is an algorithm that generates RNA pseudoknot structures with uniform probability. This algorithm serves as a stepping stone to sequence-specific as well as energy-based transition probabilities. The approach employs a correspondence between pseudoknot structures, parametrized in terms of the maximal number of mutually crossing arcs, and certain tableau sequences. The latter can be viewed as lattice paths. The main idea of this paper is to view each such lattice path as a sampling path of a stochastic process and to make use of D-finiteness for the efficient computation of the corresponding transition probabilities.
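The idea of reading a lattice path as a sampling path can be illustrated in the simplest, noncrossing setting. The sketch below is not the paper's algorithm: it assumes the lattice paths are Motzkin paths (steps +1, 0, -1 that never drop below height 0), and it replaces the D-finiteness machinery with a plain memoized count. The names `completions`, `sample_path`, and `to_dot_bracket` are hypothetical. Each transition probability is the fraction of valid completions that pass through the next state, which makes every complete path equally likely.

```python
import random
from functools import lru_cache


@lru_cache(maxsize=None)
def completions(height, steps_left):
    """Count paths from `height` back to 0 in exactly `steps_left` steps.

    Steps are +1, 0, -1 and the path may never go below height 0.
    """
    if height < 0 or height > steps_left:
        return 0
    if steps_left == 0:
        return 1  # only reachable with height == 0
    return (completions(height + 1, steps_left - 1)
            + completions(height, steps_left - 1)
            + completions(height - 1, steps_left - 1))


def sample_path(n, rng=random):
    """Sample a length-n Motzkin path uniformly at random."""
    height, path = 0, []
    for i in range(n):
        left = n - i - 1
        # Weight of each step = number of ways to finish after taking it.
        weights = [completions(height + d, left) for d in (1, 0, -1)]
        d = rng.choices((1, 0, -1), weights=weights)[0]
        path.append(d)
        height += d
    return path


def to_dot_bracket(path):
    """Render a path as a noncrossing partial matching in dot-bracket form."""
    return "".join({1: "(", 0: ".", -1: ")"}[d] for d in path)
```

Calling `to_dot_bracket(sample_path(20))` yields a random noncrossing matching on 20 positions. This toy version allows arcs between adjacent positions, which a realistic RNA model would exclude.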
In this article we study canonical γ-structures, a class of RNA pseudoknot structures that plays a key role in the context of polynomial-time folding of RNA pseudoknot structures. A γ-structure is composed of specific building blocks that have topological genus less than or equal to γ, where composition means concatenation and nesting of such blocks. Our main result is the derivation of the generating function of γ-structures via symbolic enumeration using so-called irreducible shadows. We furthermore recursively compute the generating polynomials of irreducible shadows of genus ≤ γ. The γ-structures are constructed via γ-matchings. For 1 ≤ γ ≤ 10, we compute Puiseux expansions at the unique, dominant singularities, allowing us to derive simple asymptotic formulas for the number of γ-structures.
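To make the notion of topological genus concrete, here is a minimal sketch that computes the genus of a single-backbone diagram from its arcs. It assumes the standard fat-graph convention: the backbone is collapsed to one vertex, each arc is an edge, and the boundary components are the cycles of the permutation "step to the next endpoint along the backbone, then cross the arc". Euler's formula 1 - n + r = 2 - 2g then gives the genus g for n arcs and r boundary components. The function name `genus` and the input format are hypothetical, not taken from the paper.

```python
def genus(arcs):
    """Topological genus of a one-backbone diagram given as arcs (i, j).

    Unpaired positions do not affect the genus, so only arc endpoints
    are kept and relabeled 0..2n-1 in backbone order.
    """
    ends = sorted(v for arc in arcs for v in arc)
    m = len(ends)
    if m == 0:
        return 0
    pos = {v: p for p, v in enumerate(ends)}
    partner = {}
    for i, j in arcs:
        partner[pos[i]] = pos[j]
        partner[pos[j]] = pos[i]

    # Count boundary components: cycles of p -> partner[(p + 1) % m].
    seen, boundaries = set(), 0
    for start in range(m):
        if start in seen:
            continue
        boundaries += 1
        p = start
        while p not in seen:
            seen.add(p)
            p = partner[(p + 1) % m]

    n = len(arcs)
    # Euler characteristic: 1 - n + boundaries = 2 - 2g.
    return (1 + n - boundaries) // 2
```

For example, two nested arcs `[(1, 4), (2, 3)]` have genus 0, while two crossing arcs `[(1, 3), (2, 4)]` have genus 1.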
Recently, Yoffe et al. observed that the average distances between the 5′ and 3′ ends of RNA molecules are very small and largely independent of sequence length. This observation is based on numerical computations as well as theoretical arguments maximizing certain entropy functionals. In this paper we compute the exact distribution of 5′-3′ distances of RNA secondary structures for any finite n. We furthermore compute the limit distribution and show that already for n = 30 the exact distribution and the limit distribution are very close. Our results show that the distances of random RNA secondary structures are distinctly lower than those of minimum free energy structures of random RNA sequences.
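For small n, the distribution of end-to-end distances can be checked by brute force. The sketch below rests on two assumptions that the abstract does not state: the 5′-3′ distance is taken to be the length of the shortest path between the first and last nucleotide in the graph formed by backbone edges and arcs, and a secondary structure is a noncrossing set of arcs (i, j) with j - i ≥ 2. All function names are hypothetical. The enumeration is exponential in n and is not the paper's exact method.

```python
from collections import Counter, deque


def secondary_structures(n, min_arc=2):
    """Yield every noncrossing arc set on positions 0..n-1 with j - i >= min_arc."""
    def build(i, j):
        # Structures on the half-open interval [i, j).
        if j - i <= 0:
            yield []
            return
        # Case 1: position i is unpaired.
        yield from build(i + 1, j)
        # Case 2: position i pairs with k, splitting the interval in two.
        for k in range(i + min_arc, j):
            for inner in build(i + 1, k):
                for outer in build(k + 1, j):
                    yield [(i, k)] + inner + outer
    yield from build(0, n)


def end_to_end_distance(n, arcs):
    """Shortest path length from position 0 to n-1 via backbone edges and arcs."""
    adj = {v: set() for v in range(n)}
    for v in range(n - 1):
        adj[v].add(v + 1)
        adj[v + 1].add(v)
    for i, j in arcs:
        adj[i].add(j)
        adj[j].add(i)
    dist = {0: 0}
    queue = deque([0])
    while queue:
        v = queue.popleft()
        if v == n - 1:
            return dist[v]
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)


def distance_distribution(n):
    """Map each distance to the number of structures of length n attaining it."""
    return Counter(end_to_end_distance(n, s) for s in secondary_structures(n))
```

Dividing the counts from `distance_distribution(n)` by their total gives the distribution under the uniform measure on structures, which is the setting of "random RNA secondary structures" assumed here.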
In this paper we present a combinatorial proof of a relation between the generating functions of unicellular and bicellular maps. This relation is a consequence of the Schwinger-Dyson equation of matrix theory. Alternatively it can be proved using representation theory of the symmetric group. Here we give a bijective proof by rewiring unicellular maps of topological genus (g + 1) into bicellular maps of genus g and pairs of unicellular maps of lower topological genera. Our result has immediate consequences for the folding of RNA interaction structures, since the time complexity of folding the transformed structure is $O((n + m)^5)$, where n, m are the lengths of the respective backbones, while the folding of the original structure has $O(n^6)$ time complexity, where n is the length of the longer sequence.
In this paper we study k-noncrossing RNA structures with arc-length ≥ 3, i.e. RNA molecules in which for any i, the nucleotides labeled i and i + j (j = 1, 2) cannot form a bond and in which there are at most k − 1 mutually crossing arcs. Let $S_{k,3}(n)$ denote their number. Based on a novel functional equation for the generating function $\sum_{n \ge 0} S_{k,3}(n) z^n$, we derive for arbitrary k ≥ 3 exponential growth factors and for k = 3 the subexponential factor. Our main result is the derivation of the formula $S_{3,3}(n) \sim \frac{6.11170 \cdot 4!}{n(n-1)\cdots(n-4)}\, 4.54920^n$.
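The asymptotic formula above can be evaluated numerically. The sketch below reads it as the constant 6.11170 · 4!, divided by the falling factorial n(n−1)⋯(n−4), times 4.54920 to the power n. That reading is an assumption about the intended layout of the formula. The function names are hypothetical. The computation is done in log space so that large n does not overflow a float, and the result is only a leading-order estimate, not the exact count.

```python
import math

CONSTANT = 6.11170      # leading constant from the stated formula
GROWTH_RATE = 4.54920   # exponential growth factor from the stated formula


def log_estimate(n):
    """Natural log of the asymptotic estimate for the number of structures."""
    if n < 5:
        raise ValueError("the falling factorial n(n-1)...(n-4) needs n >= 5")
    log_falling = sum(math.log(n - i) for i in range(5))
    return (math.log(CONSTANT) + math.log(math.factorial(4))
            - log_falling + n * math.log(GROWTH_RATE))


def estimate(n):
    """Asymptotic estimate itself; overflows for n in the high hundreds."""
    return math.exp(log_estimate(n))
```

Since n(n−1)⋯(n−4) grows like n^5, the subexponential factor behaves like n^(-5) for large n, and the ratio `estimate(n + 1) / estimate(n)` tends to the growth factor 4.54920.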