Direct coupling analysis of nucleotide coevolution provides a novel approach to identify which nucleotides in an RNA molecule are likely in direct contact, and this information, obtained from sequence alone, can be used to predict RNA 3D structures with much improved accuracy. Here we present an efficient method that incorporates this information into current RNA 3D structure prediction methods, specifically 3dRNA. Our method yields much more accurate RNA 3D structure predictions than the original 3dRNA as well as other existing prediction methods that use direct coupling analysis. In particular, our method demonstrates a significant improvement in predicting multi-branch junction conformations, a major bottleneck for RNA 3D structure prediction. We also show that our method can be used to optimize the predictions of other methods. These results indicate that optimization of RNA 3D structure prediction using evolutionary restraints of nucleotide–nucleotide interactions from direct coupling analysis offers an efficient way to obtain accurate RNA tertiary structure predictions.
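One way to picture how predicted contacts could act as restraints on a candidate 3D model is a distance penalty summed over contact pairs. The sketch below is illustrative only: the flat-bottom quadratic form, the 8 Å cutoff, the one-representative-atom-per-nucleotide simplification, and the name `restraint_penalty` are all assumptions, not the scoring actually used in 3dRNA.

```python
import math

def restraint_penalty(coords, contacts, cutoff=8.0):
    """Sum of flat-bottom penalties over predicted contacts (hypothetical form).

    coords   : dict mapping nucleotide index -> (x, y, z) of one representative atom
    contacts : iterable of (i, j) index pairs predicted to be in direct contact
    cutoff   : assumed distance (in angstroms) below which a contact is satisfied
    """
    total = 0.0
    for i, j in contacts:
        d = math.dist(coords[i], coords[j])
        # No penalty inside the cutoff; quadratic growth beyond it.
        total += max(0.0, d - cutoff) ** 2
    return total
```

A candidate model that violates fewer predicted contacts receives a lower penalty, so such a term could be used to rank or refine models produced by any prediction method.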
Noncoding RNAs play important roles in the cell, and their secondary structures are vital for understanding their tertiary structures and functions. Many methods for predicting RNA secondary structure have been proposed, but it is still challenging to reach high accuracy, especially for structures with pseudoknots. Here we present a coupled deep learning model, called 2dRNA, to predict RNA secondary structure. It combines two well-known neural network architectures, bidirectional LSTM and U-Net, and needs only the sequence of a target RNA as input. Benchmarking shows that our method achieves state-of-the-art performance compared with current methods on a testing dataset. Our analysis also shows that 2dRNA can learn structural information from similar RNA sequences without aligning them.
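To make the pseudoknot difficulty concrete, the sketch below parses a secondary structure written in extended dot-bracket notation and checks whether any two base pairs cross. The notation and the function names `parse_dot_bracket` and `has_pseudoknot` are assumptions chosen for illustration; the abstract does not specify 2dRNA's output format.

```python
def parse_dot_bracket(structure):
    """Return sorted 0-based (i, j) base pairs from extended dot-bracket notation."""
    closers = {")": "(", "]": "[", "}": "{", ">": "<"}
    stacks = {opener: [] for opener in closers.values()}
    pairs = []
    for pos, ch in enumerate(structure):
        if ch in stacks:
            stacks[ch].append(pos)
        elif ch in closers:
            stack = stacks[closers[ch]]
            if not stack:
                raise ValueError(f"unmatched {ch!r} at position {pos}")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    if any(stacks.values()):
        raise ValueError("unclosed bracket in structure")
    return sorted(pairs)

def has_pseudoknot(pairs):
    """True if any two pairs cross, i.e. i < k < j < l."""
    return any(i < k < j < l for i, j in pairs for k, l in pairs)
```

For example, `"((..))"` is a nested hairpin with no crossing pairs, whereas `"([.)]"` contains the pairs (0, 3) and (1, 4), which cross and therefore form a pseudoknot.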
Deep learning methods for RNA secondary structure prediction have shown higher performance than traditional methods, but there is still much room for improvement. RNAs vary widely in length, and so do their secondary structures. However, current deep learning methods all use length-independent models, which makes it difficult for them to learn such diverse secondary structures. Here, we propose a length-dependent model that is obtained by further training the length-independent model on different length ranges of RNAs through transfer learning. We use 2dRNA, a coupled deep learning neural network for RNA secondary structure prediction, as the base model. Benchmarking shows that the length-dependent model performs better than the usual length-independent model.
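At prediction time, a length-dependent scheme implies routing each sequence to the model fine-tuned on its length range. The sketch below shows one way this dispatch could look. The bin boundaries, the model labels, and the function name `pick_model` are hypothetical; the abstract does not state the length ranges that were used.

```python
from bisect import bisect_left

def pick_model(length, upper_bounds, models):
    """Select the model whose length range contains `length`.

    upper_bounds : sorted, inclusive upper bounds of all ranges except the last
    models       : one entry per range; the last handles every longer sequence
    """
    if len(models) != len(upper_bounds) + 1:
        raise ValueError("need exactly one more model than upper bounds")
    return models[bisect_left(upper_bounds, length)]

# Hypothetical ranges: <=80, 81-150, 151-300, >300 nucleotides.
BOUNDS = [80, 150, 300]
MODELS = ["short", "medium", "long", "very_long"]  # stand-ins for fine-tuned models
```

With these illustrative bins, `pick_model(80, BOUNDS, MODELS)` selects the first model and `pick_model(81, BOUNDS, MODELS)` selects the second.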
RNA molecules participate in many important biological processes, and they need to fold into well-defined secondary and tertiary structures to realize their functions. Like the well-known protein folding problem, there is also an RNA folding problem. The folding problem includes two aspects: structure prediction and folding mechanism. Although the former has been widely studied, the latter is still not well understood. Here we present a deep reinforcement learning algorithm, 2dRNA-Fold, to study the fastest folding paths of RNA secondary structure. 2dRNA-Fold uses a neural network combined with Monte Carlo tree search to select residue pairings step by step for a given RNA sequence until the final secondary structure is formed. We apply 2dRNA-Fold to several short RNA molecules and one longer RNA, 1Y26, and find that their fastest folding paths show some interesting features. 2dRNA-Fold is further trained on a set of RNA molecules from the bpRNA dataset and is used to predict RNA secondary structure. Since the scoring that determines the next step in 2dRNA-Fold is based on possible base pairings, the learned or predicted fastest folding path may not agree with the actual folding paths determined by free energy according to physical laws.
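The step-by-step pairing process can be illustrated with a much simpler stand-in. The sketch below enumerates the pairings that may legally be added to a partial structure and builds a path by greedily taking the best-scoring one at each step. It deliberately replaces the neural network and Monte Carlo tree search with a user-supplied `score` function; the canonical pair set, the minimum loop length of 3, the exclusion of crossing pairs, and all names are assumptions made for illustration.

```python
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def legal_pairs(seq, pairs, min_loop=3):
    """List (i, j) pairs that could be added to the current partial structure."""
    paired = {k for p in pairs for k in p}
    moves = []
    for i in range(len(seq)):
        for j in range(i + min_loop + 1, len(seq)):
            if i in paired or j in paired:
                continue
            if (seq[i], seq[j]) not in CANONICAL:
                continue
            # Reject pairs that would cross an existing pair (no pseudoknots here).
            if any(a < i < b < j or i < a < j < b for a, b in pairs):
                continue
            moves.append((i, j))
    return moves

def greedy_path(seq, score):
    """Build a folding path by repeatedly adding the best-scoring legal pair."""
    pairs, path = [], []
    while True:
        moves = legal_pairs(seq, pairs)
        if not moves:
            return path
        best = max(moves, key=lambda m: score(seq, pairs, m))
        pairs.append(best)
        path.append(best)

# Two toy scoring rules: close the outermost pair first, or the innermost first.
outer_first = lambda seq, pairs, m: m[1] - m[0]
inner_first = lambda seq, pairs, m: -(m[1] - m[0])
```

For the toy sequence `GGAAACC`, both scoring rules end at the same two-pair hairpin but reach it in opposite orders, which is the sense in which different policies define different folding paths to one final structure.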