Viral genomic RNA adopts many conformations during its life cycle as the genome is replicated, translated, and encapsidated. The high-resolution crystallographic structure of the satellite tobacco mosaic virus (STMV) particle reveals 30 helices of well-ordered RNA. The crystallographic data provide global constraints on the possible secondary structures for the encapsidated RNA. Traditional free energy minimization methods of RNA secondary structure prediction do not generate structures consistent with the crystallographic data, and to date no complete STMV RNA basepaired secondary structure has been generated. RNA-protein interactions and tertiary interactions may contribute a significant degree of stability, and the kinetics of viral assembly may dominate the folding process. The computational tools, Helix Find & Combine, Crumple, and Sliding Windows and Assembly, evaluate and explore the possible secondary structures for encapsidated STMV RNA. All possible hairpins consistent with the experimental data and a cotranscriptional folding and assembly hypothesis were generated, and the combination of hairpins that was most consistent with experimental data is presented as the best representative structure of the ensemble. Multiple solutions to the genome packaging problem could be an evolutionary advantage for viruses. In such cases, an ensemble of structures that share favorable global features best represents the RNA fold.
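The step of generating all possible hairpins can be illustrated with a minimal sketch. This is not the Helix Find & Combine implementation; the function name `find_hairpins`, the allowed pair set, and the stem and loop limits are assumptions chosen only for illustration.

```python
from typing import List, Tuple

# Allowed base pairs: Watson-Crick plus G-U wobble (assumed for this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_hairpins(seq: str, min_stem: int = 3, min_loop: int = 3,
                  max_loop: int = 8) -> List[Tuple[int, int, int]]:
    """Return (start, end, stem_length) for every perfect stem-loop in seq.

    Hypothetical helper: for each candidate loop it extends the stem
    outward for as long as the flanking nucleotides can pair.
    """
    hairpins = []
    n = len(seq)
    for loop_start in range(1, n):
        for loop_len in range(min_loop, max_loop + 1):
            left, right = loop_start - 1, loop_start + loop_len
            stem = 0
            # Grow the stem while both flanking positions exist and pair.
            while left >= 0 and right < n and (seq[left], seq[right]) in PAIRS:
                stem += 1
                left -= 1
                right += 1
            if stem >= min_stem:
                hairpins.append((left + 1, right - 1, stem))
    return hairpins


if __name__ == "__main__":
    print(find_hairpins("GGGAAACCC"))
```

Candidates found this way would still need to be screened against the experimental data, for example the number of helices seen in the crystal structure.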
The secondary structure of encapsidated MS2 genomic RNA poses an interesting RNA folding challenge. Cryoelectron microscopy has demonstrated that encapsidated MS2 RNA is well-ordered. Models of MS2 assembly suggest that the RNA hairpin-protein interactions and the appropriate placement of hairpins in the MS2 RNA secondary structure can guide the formation of the correct icosahedral particle. The RNA hairpin motif that is recognized by the MS2 capsid protein dimers, however, is energetically unfavorable, and thus free energy predictions are biased against this motif. Computer programs called Crumple, Sliding Windows, and Assembly provide useful tools for prediction of viral RNA secondary structures when the traditional assumptions of RNA structure prediction by free energy minimization may not apply. These methods allow incorporation of global features of the RNA fold and motifs that are difficult to include directly in minimum free energy predictions. For example, with MS2 RNA the experimental data from SELEX experiments, crystallography, and theoretical calculations of the path for the series of hairpins can be incorporated in the RNA structure prediction, and thus the influence of free energy considerations can be modulated. This approach thoroughly explores conformational space and generates an ensemble of secondary structures. The predictions from this new approach can test hypotheses and models of viral assembly and guide construction of complete three-dimensional models of virus particles.
The diverse landscape of RNA conformational space includes many canyons and crevices that are distant from the lowest minimum free energy valley and remain unexplored by traditional RNA structure prediction methods. A complete description of the entire RNA folding landscape can facilitate identification of biologically important conformations. The Crumple algorithm rapidly enumerates all possible non-pseudoknotted structures for an RNA sequence without consideration of thermodynamics while filtering the output with experimental data. The Crumple algorithm provides an alternative approach to traditional free energy minimization programs for RNA secondary structure prediction. A complete computation of all non-pseudoknotted secondary structures can reveal structures that would not be predicted by methods that sample the RNA folding landscape based on thermodynamic predictions. The free energy minimization approach is often successful but is limited by not considering RNA tertiary and protein interactions and the possibility that kinetics rather than thermodynamics determines the functional RNA fold. Efficient parallel computing and filters based on experimental data make practical the complete enumeration of all non-pseudoknotted structures. Efficient parallel computing for Crumple is implemented in a ring graph approach. Filters for experimental data include constraints from chemical probing of solvent accessibility, enzymatic cleavage of paired or unpaired nucleotides, phylogenetic covariation, and the minimum number and lengths of helices determined from crystallography or cryo-electron microscopy. The minimum number and length of helices have a significant effect on reducing conformational space. Pairing constraints reduce conformational space more than single nucleotide constraints.
Examples with Alfalfa Mosaic Virus RNA and Trypanosoma brucei guide RNA demonstrate the importance of evaluating all possible structures when pseudoknots, RNA-protein interactions, and metastable structures are important for biological function. Crumple software is freely available at http://adenosine.chem.ou.edu/software.html.
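The idea of enumerating every non-pseudoknotted structure without thermodynamics, then filtering with experimental data, can be sketched as follows. This is not the Crumple code: the recursion, the names `enumerate_structures` and `satisfies_probing`, the allowed pair set, and the three-nucleotide minimum hairpin loop are all assumptions for illustration, and the sketch is serial rather than parallel.

```python
from typing import Iterator, Optional, Set

# Allowed base pairs: Watson-Crick plus G-U wobble (assumed for this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired nucleotides in a hairpin loop


def enumerate_structures(seq: str, i: int = 0,
                         j: Optional[int] = None) -> Iterator[str]:
    """Yield every non-pseudoknotted dot-bracket structure for seq[i..j]."""
    if j is None:
        j = len(seq) - 1
    if i > j:
        yield ""
        return
    # Case 1: nucleotide i is unpaired.
    for rest in enumerate_structures(seq, i + 1, j):
        yield "." + rest
    # Case 2: nucleotide i pairs with k; the regions inside and after the
    # pair are independent, which excludes pseudoknots by construction.
    for k in range(i + MIN_LOOP + 1, j + 1):
        if (seq[i], seq[k]) in PAIRS:
            for inner in enumerate_structures(seq, i + 1, k - 1):
                for outer in enumerate_structures(seq, k + 1, j):
                    yield "(" + inner + ")" + outer


def satisfies_probing(structure: str, unpaired: Set[int]) -> bool:
    """Hypothetical filter: positions reported as accessible must be unpaired."""
    return all(structure[pos] == "." for pos in unpaired)


if __name__ == "__main__":
    for s in enumerate_structures("GAAAC"):
        print(s, satisfies_probing(s, {0}))
```

The number of structures grows exponentially with sequence length, which is why filtering during enumeration and parallel execution matter for sequences of realistic size.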
We present new modifications to the Wuchty algorithm in order to better define and explore possible conformations for an RNA sequence. The new features, including parallelization, energy-independent lonely pair constraints, context-dependent chemical probing constraints, helix filters, and optional multibranch loops, provide useful tools for exploring the landscape of RNA folding. Chemical probing alone may not necessarily define a single unique structure. The helix filters and optional multibranch loops are global constraints on RNA structure that are an especially useful tool for generating models of encapsidated viral RNA for which cryoelectron microscopy or crystallography data may be available. The computations generate a combinatorially complete set of structures near a free energy minimum and thus provide data on the density and diversity of structures near the bottom of a folding funnel for an RNA sequence. The conformational landscapes for some RNA sequences may resemble a low, wide basin rather than a steep funnel that converges to a single structure.
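A helix filter and a lonely pair check of the kind described above can be sketched on dot-bracket strings. The function names, the definition of a helix as an uninterrupted run of stacked pairs, and the thresholds are assumptions for illustration, not the authors' implementation.

```python
from typing import List, Tuple


def pair_table(structure: str) -> List[int]:
    """Return the partner index of each position, or -1 if unpaired."""
    stack: List[int] = []
    table = [-1] * len(structure)
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            partner = stack.pop()
            table[pos], table[partner] = partner, pos
    return table


def helices(structure: str) -> List[Tuple[int, int, int]]:
    """Return (five_prime, three_prime, length) for each run of stacked pairs."""
    table = pair_table(structure)
    found = []
    i, n = 0, len(structure)
    while i < n:
        j = table[i]
        if j > i:  # i opens a pair; follow the stack inward
            length = 1
            while i + length < j - length and table[i + length] == j - length:
                length += 1
            found.append((i, j, length))
            i += length
        else:
            i += 1
    return found


def passes_helix_filter(structure: str, min_helices: int, min_length: int) -> bool:
    """Keep structures with at least min_helices helices of min_length pairs."""
    long_enough = [h for h in helices(structure) if h[2] >= min_length]
    return len(long_enough) >= min_helices


def has_lonely_pair(structure: str) -> bool:
    """A lonely pair is a helix made of a single base pair."""
    return any(length == 1 for _, _, length in helices(structure))


if __name__ == "__main__":
    print(helices("((...))..(((...)))"))
```

Because these checks use only the pairing pattern, they are independent of any energy model and can be applied to any enumerated structure.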
Abstract: The real world is composed of sets of objects that move and morph in both space and time. Useful concepts can be defined in terms of the complex interactions between the multi-dimensional attributes of subsets of these objects and of the relationships that exist between them. In this paper, we present Spatiotemporal Multi-dimensional Relational Framework (SMRF) Trees, a new data mining technique that extends the successful Spatiotemporal Relational Probability Tree models. From a set of labeled, multi-object examples of a target concept, our algorithm infers both the set of objects that participate in the concept and the key object and relation attributes that describe the concept. In contrast to other relational model approaches, SMRF trees do not rely on pre-defined relations between objects. Instead, our algorithm infers the relations from the continuous attributes. In addition, our approach explicitly acknowledges the multi-dimensional nature of attributes such as position, orientation and color. Our method performs well in exploratory experiments, demonstrating its viability as a relational learning approach.
Keywords: relational learning; continuous multi-dimensional attributes; multiple instance learning; spatial representations
I. MOTIVATION AND BACKGROUND
The world is composed of collections of objects, each with a set of associated attributes. Whether it is a robot preparing to perform the next step in a cooking sequence or an agent generating warnings of severe weather, only a specific subset of the observable objects is relevant to making decisions about what steps to take next. In particular, the relevance of an object is determined by its attributes and the relations that it has with other objects. These attributes are often continuous and multi-dimensional, such as Cartesian positions or colors in a red-green-blue (RGB) space.
Given a set of training examples, our challenge is to discover the objects that play the crucial roles in the examples as well as the description of the key object attributes and relations.
Our work is inspired by the successful Relational Probability Tree (RPT) [1] and the Spatiotemporal Relational Probability Tree (SRPT) [2] models. Both approaches create probability estimation trees, a form of a decision tree with probabilities at the leaves. Splits in the decision trees can ask questions about the observed properties of the objects or their relationships. Given a novel graph, these decision trees estimate the probability that the graph contains a set of objects that corresponds to some target concept. Like Kubica et al. [3], [4], these approaches build models using pre-specified categorical relations.
The Spatiotemporal Multidimensional Relational Framework (SMRF) extends this prior work in two key ways. The first extension is the ability to ask questions based on continuous, multi-dimensional attributes. For example, the color of a pixel can be represented as an RGB tuple. Capturing a concept such as "yellow" requires that the blue variable be low but the green and r...