RNA molecules with novel functions have revived interest in the accurate prediction of RNA three-dimensional (3D) structure and folding dynamics. However, existing methods are inefficient in automated 3D structure prediction. Here, we report a robust computational approach for rapid folding of RNA molecules. We develop a simplified RNA model for discrete molecular dynamics (DMD) simulations, incorporating base-pairing and base-stacking interactions. We demonstrate correct folding of 150 structurally diverse RNA sequences. The majority of DMD-predicted 3D structures have <4 Å deviations from experimental structures. The secondary structures corresponding to the predicted 3D structures consist of 94% native base-pair interactions. Folding thermodynamics and kinetics of tRNA(Phe), pseudoknots, and mRNA fragments in DMD simulations are in agreement with previous experimental findings. Folding of RNA molecules features transient, non-native conformations, suggesting nonhierarchical RNA folding. Our method allows rapid conformational sampling of RNA folding, with computational time increasing linearly with RNA length. We envision this approach as a promising tool for RNA structural and functional analyses.
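As an illustration of the native base-pair statistic quoted in this abstract, the following minimal Python sketch computes the fraction of native base pairs that a predicted secondary structure reproduces. It is not the authors' code. It assumes both structures are given as pseudoknot-free dot-bracket strings, and it takes one reading of the statistic (native pairs recovered divided by all native pairs); the function names `parse_dot_bracket` and `native_pair_fraction` are hypothetical.

```python
def parse_dot_bracket(structure):
    """Return the set of base pairs (i, j), with i < j, in a dot-bracket string.

    Assumption: only '(' , ')' and '.' are meaningful, so pseudoknots
    (which need extra bracket types) are not represented here.
    """
    open_positions = []
    pairs = set()
    for index, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(index)
        elif symbol == ")":
            if not open_positions:
                raise ValueError("unbalanced ')' at position %d" % index)
            pairs.add((open_positions.pop(), index))
    if open_positions:
        raise ValueError("unbalanced '(' in structure")
    return pairs


def native_pair_fraction(predicted, native):
    """Fraction of native base pairs that also occur in the prediction."""
    native_pairs = parse_dot_bracket(native)
    if not native_pairs:
        raise ValueError("native structure contains no base pairs")
    predicted_pairs = parse_dot_bracket(predicted)
    return len(native_pairs & predicted_pairs) / len(native_pairs)


if __name__ == "__main__":
    native = "((((....))))"
    predicted = "(((......)))"  # innermost native pair is missing
    print(native_pair_fraction(predicted, native))  # 0.75
```

In the toy example, three of the four native pairs are recovered. Handling pseudoknots, which the abstract also mentions, would need a richer structure notation than the one assumed here.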
We describe a technique for the detection and localization of RNA transcripts in living cells. The method is based on fluorescent-protein complementation regulated by the interaction of a split RNA-binding protein with its corresponding RNA aptamer. In our design, the RNA-binding protein is the eukaryotic initiation factor 4A (eIF4A). eIF4A is dissected into two fragments, and each fragment is fused to split fragments of the enhanced green fluorescent protein (EGFP). Coexpression of the two protein fusions in the presence of a transcript containing an eIF4A-interacting RNA aptamer resulted in the restoration of EGFP fluorescence in Escherichia coli cells. We also applied this technique to the visualization of an aptamer-tagged mRNA and 5S ribosomal RNA (rRNA). We observed distinct spatial and temporal changes in fluorescence within single cells, reflecting the nature of the transcript.
We describe here the identification of eight polymorphic microsatellite loci with (CA)n repeats in the Trypanosoma cruzi genome based on the affinity capture of fragments using biotinylated (CA)12 attached to streptavidin-coated magnetic beads. The presence of two peaks in PCR amplification products from individual clones confirmed that T. cruzi is diploid. Hardy-Weinberg and linkage disequilibrium analyses suggested that sexual reproduction is rare or absent and that the population structure is clonal. Several strains, especially those isolated from nonhuman sources, showed more than two alleles in many loci, demonstrating that they were multiclonal. The phylogenetic analysis of T. cruzi based on microsatellites revealed a great genetic distance among strains, although the strain dispersion profile in the Wagner network was in general agreement with the species dimorphism found by PCR amplification of the divergent region of the rRNA 24Sα gene.
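The Hardy-Weinberg analysis mentioned in this abstract compares observed genotype counts with those expected under random mating. The sketch below shows the idea for a single locus, under the simplifying assumption of only two alleles (microsatellite loci usually carry more, and the abstract does not describe the authors' actual procedure). The function name `hardy_weinberg_chi_square` and the genotype counts are hypothetical illustration values.

```python
def hardy_weinberg_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for deviation from Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb are observed counts of the three genotypes at a
    two-allele locus. Returns (expected_counts, chi_square).
    """
    total = n_aa + n_ab + n_bb
    if total == 0:
        raise ValueError("no genotypes observed")
    # Allele frequency of A: each AA carries two copies, each AB one.
    p = (2 * n_aa + n_ab) / (2 * total)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("locus is monomorphic; the test is undefined")
    expected = (p * p * total, 2 * p * q * total, q * q * total)
    observed = (n_aa, n_ab, n_bb)
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, chi_square


if __name__ == "__main__":
    # Hypothetical sample with a strong excess of heterozygotes.
    expected, chi_square = hardy_weinberg_chi_square(10, 80, 10)
    print(expected)    # (25.0, 50.0, 25.0)
    print(chi_square)  # 36.0
```

A large statistic (here 36.0, against one degree of freedom) indicates genotype proportions far from random-mating expectations, which is the kind of signal that is consistent with rare or absent sexual reproduction.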
Large-scale analysis of the GC-content distribution at the gene level reveals both common features and basic differences in genomes of different groups of species. Sharp changes in GC content are detected at the transcription boundaries for all species analyzed, including human, mouse, rat, chicken, fruit fly, and worm. However, two substantially distinct groups of GC-content profiles can be recognized: warm-blooded vertebrates including human, mouse, rat, and chicken, and invertebrates including fruit fly and worm. In vertebrates, sharp positive and negative spikes of GC content are observed at the transcription start and stop sites, respectively, and there is also a progressive decrease in GC content from the 5′ untranslated region to the 3′ untranslated region along the gene. In invertebrates, the positive and negative GC-content spikes at the transcription start and stop sites are preceded by spikes of opposite value, and the highest GC content is found in the coding regions of the genes. Cross-correlation analysis indicates high frequencies of GC-content spikes at transcription start and stop sites. The strong conservation of this genomic feature seen in comparisons of the human/mouse and human/rat orthologs, and the clustering of genes with GC-content spikes on chromosomes imply a biological function. The GC-content spikes at transcription boundaries may reflect a general principle of genomic punctuation. Our analysis also provides means for identifying these GC-content spikes in individual genomic sequences.
Keywords: gene clustering | gene ontology | transcription start site | transcription stop site
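To make the notion of a GC-content change at a transcription boundary concrete, here is a minimal Python sketch. It is an illustration only, not the method used in the study: the window and flank sizes, the function names (`gc_fraction`, `gc_profile`, `gc_jump`), and the toy sequence are all assumptions.

```python
def gc_fraction(sequence):
    """Fraction of G or C among the unambiguous bases (A, C, G, T)."""
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence has no unambiguous bases")
    return sum(base in "GC" for base in bases) / len(bases)


def gc_profile(sequence, window, step=1):
    """GC fraction in sliding windows of fixed length along the sequence."""
    last_start = len(sequence) - window
    return [gc_fraction(sequence[start:start + window])
            for start in range(0, last_start + 1, step)]


def gc_jump(sequence, boundary, flank):
    """GC fraction downstream minus upstream of a boundary position.

    'boundary' is a 0-based index such as an annotated transcription
    start site; a large positive value marks a sharp rise in GC content.
    """
    if boundary - flank < 0 or boundary + flank > len(sequence):
        raise ValueError("flank extends beyond the sequence")
    upstream = sequence[boundary - flank:boundary]
    downstream = sequence[boundary:boundary + flank]
    return gc_fraction(downstream) - gc_fraction(upstream)


if __name__ == "__main__":
    # Toy sequence: AT-only upstream, GC-only downstream of index 10.
    toy = "ATATATATAT" + "GCGCGCGCGC"
    print(gc_profile(toy, window=10, step=5))  # [0.0, 0.5, 1.0]
    print(gc_jump(toy, boundary=10, flank=10))  # 1.0
```

On real data, `boundary` would come from gene annotation, and averaging `gc_profile` outputs over many genes aligned at their start or stop sites would give a profile of the kind the abstract describes.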