Summary
The distance geometry problem is often encountered in molecular biology and the life sciences at large, as a host of experimental methods produce ambiguous and noisy distance data. In this note, we present diSTruct, an adaptation of the generic MaxEnt-Stress graph drawing algorithm to the domain of biological macromolecules. diSTruct is fast, provides reliable structural models even from incomplete or noisy distance data, and integrates access to graph analysis tools.
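To make the distance geometry problem concrete, the following minimal sketch recovers 3D coordinates from a set of pairwise distances by plain gradient descent on a least-squares stress function. It is not the MaxEnt-Stress algorithm and not diSTruct's implementation; the function names `stress` and `embed`, the step size, the iteration count and the toy input are all assumptions chosen for illustration.

```python
import math
import random


def stress(coords, distances):
    """Sum of squared differences between model and target distances."""
    return sum((math.dist(coords[i], coords[j]) - d) ** 2
               for (i, j), d in distances.items())


def embed(n_points, distances, dim=3, steps=2000, lr=0.05, seed=0):
    """Place n_points in `dim` dimensions so that the known distances fit.

    `distances` maps index pairs (i, j) to target distances. Pairs that
    are missing from the mapping are simply left unconstrained.
    """
    rng = random.Random(seed)
    coords = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
              for _ in range(n_points)]
    for _ in range(steps):
        grads = [[0.0] * dim for _ in range(n_points)]
        for (i, j), target in distances.items():
            current = math.dist(coords[i], coords[j])
            if current == 0.0:
                continue  # direction undefined for coincident points
            scale = 2.0 * (current - target) / current
            for k in range(dim):
                g = scale * (coords[i][k] - coords[j][k])
                grads[i][k] += g
                grads[j][k] -= g
        for i in range(n_points):
            for k in range(dim):
                coords[i][k] -= lr * grads[i][k]
    return coords


# Hypothetical toy input: four points with all six distances known.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
known = {(i, j): math.dist(reference[i], reference[j])
         for i in range(4) for j in range(i + 1, 4)}
model = embed(4, known)
print(f"residual stress: {stress(model, known):.2e}")
```

Distances alone fix a structure only up to rotation, translation and reflection, so the recovered coordinates generally differ from the reference even when the residual stress is close to zero.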
Availability and implementation
diSTruct is written in C++, Cython and Python 3. It is available from https://github.com/KIT-MBS/distruct.git or from the Python Package Index, under the MIT license.
Supplementary information
Supplementary data are available at Bioinformatics online.
On the path to a full understanding of the structure-function relationship of RNA, or even to RNA design, structure prediction would offer an intriguing complement to experimental efforts. Any deep learning on RNA structure, however, is hampered by the sparsity of labeled training data. Utilizing the limited data available, we here focus on predicting spatial adjacencies ("contact maps") as a proxy for 3D structure. We explore the space of self-supervised learning for RNA multiple sequence alignments and focus on downstream contact prediction from latent attention maps. Boosted decision trees in particular advance contact prediction quality and can be further enhanced by finetuning the pretrained backbone. Impressively, they double the precision of contact prediction and reduce false positives by a factor of five over the baseline. We name our model BARNACLE. Our conceptual advance could prove a breakthrough in decreasing the sequence-structure gap for RNA and is generalizable to other tasks.
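As a rough illustration of the quantities involved, the sketch below derives a contact map from 3D coordinates and scores a list of predicted contacts by precision. The 10 Å cutoff, the minimum sequence separation of four positions, the function names and the toy hairpin coordinates are assumptions made for this example. They are not taken from BARNACLE, and the learned predictor itself is not shown.

```python
import math


def contact_map(coords, cutoff=10.0, min_separation=4):
    """Return the set of pairs (i, j), i < j, closer than `cutoff`.

    Pairs fewer than `min_separation` positions apart in sequence are
    skipped, since neighbours along the chain are trivially close.
    """
    contacts = set()
    n = len(coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                contacts.add((i, j))
    return contacts


def precision(predicted, true_contacts):
    """Fraction of predicted pairs that are true contacts: TP / (TP + FP)."""
    predicted = set(predicted)
    if not predicted:
        return 0.0
    return len(predicted & true_contacts) / len(predicted)


# Hypothetical hairpin: two antiparallel strands 6 units apart.
hairpin = [(0, 0, 0), (5, 0, 0), (10, 0, 0), (15, 0, 0), (20, 0, 0),
           (20, 6, 0), (15, 6, 0), (10, 6, 0), (5, 6, 0), (0, 6, 0)]
truth = contact_map(hairpin)
print(precision([(0, 9), (0, 4)], truth))
```

Because precision is TP / (TP + FP), removing false positives from a fixed set of true positives raises precision directly, which is why the two figures of merit are often reported together.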