The wealth of the combinatorics of nucleotide base pairs enables RNA molecules to assemble into sophisticated interaction networks, which are used to create complex 3D substructures. These interaction networks are essential to shape the 3D architecture of the molecule, and also to provide the key elements to carry out molecular functions such as protein or ligand binding. They are made of organised sets of long-range tertiary interactions which connect distinct secondary structure elements in 3D structures. Here, we present a de novo data-driven approach to automatically extract the recurrent interaction networks (RINs) from large data sets of full RNA 3D structures. Our methodology enables us for the first time to detect the interaction networks connecting distinct components of the RNA structure, highlighting their diversity and conservation across unrelated functional RNAs. We use a graphical model to perform pairwise comparisons of all RNA structures available and to extract RINs and modules. Our analysis yields a complete catalog of RNA 3D structures available in the Protein Data Bank and reveals the intricate hierarchical organization of the RNA interaction networks and modules. We assembled our results in an online database (http://carnaval.lri.fr) which will be regularly updated. Within the site, a tool allows users with a novel RNA structure to detect automatically whether it contains previously observed RINs.
Defining the molecular mechanisms of novel emerging diseases like COVID-19 is crucial to identify treatable traits to improve patient care. To circumvent a priori bias and the lack of in-depth knowledge of a new disease, we opted for an unsupervised approach, using the detailed circulating proteome, as measured by 4985 aptamers (SOMAmers), of 731 SARS-CoV-2 PCR-positive hospitalized participants in the Biobanque québécoise de la COVID-19 (BQC19). Consensus clustering identified six endophenotypes (EPs) present in this cohort, with varying degrees of disease severity. One endophenotype, EP6, was associated with a greater proportion of ICU admission, mechanical ventilation, acute respiratory distress syndrome (ARDS) and death. Clinical features of this endophenotype included increased levels of C-reactive protein and D-dimers, elevated neutrophils, and depleted lymphocytes. Moreover, metabolomic analysis supported a role for immunothrombosis in severe COVID-19 ARDS. Furthermore, the approach enabled the identification of Fibroblast Growth Factor Receptor (FGFR) and SH2-containing transforming protein 4 (SHC4) signaling as features of the molecular pathways associated with severe COVID-19. Finally, this information was sufficient to train an accurate predictive model solely based on clinical laboratory measurements, suggesting the use of blood markers as surrogates for generalizing these EPs to new patients and automating identification of high-risk groups in the clinic.
RNA tertiary structure is crucial to its many non-coding molecular functions. RNA architecture is shaped by its secondary structure, composed of stems, made of stacked canonical base pairs, enclosing loops. While stems are precisely captured by free-energy models, loops composed of non-canonical base pairs are not. Nor are distant interactions linking together those secondary structure elements (SSEs). Databases of conserved 3D geometries (a.k.a. modules) not captured by energetic models are leveraged for structure prediction and design, but the computational complexity has limited their study to local elements, namely loops. Representing the RNA structure as a graph has recently made it possible to extend this work to pairs of SSEs, uncovering a hierarchical organization of these 3D modules, at great computational cost. Systematically capturing recurrent patterns on a large scale is a main challenge in the study of RNA structures. In this paper, we present an efficient algorithm to compute maximal isomorphisms in edge-colored graphs. We extend this algorithm to a framework well suited to identify RNA modules, and fast enough to considerably generalize previous approaches. To exhibit the versatility of our framework, we first reproduce results identifying all common modules spanning more than 2 SSEs, in a few hours instead of weeks. The efficiency of our new algorithm is demonstrated by computing the maximal modules between any pair of entire RNAs in the non-redundant corpus of known RNA 3D structures. We observe that the biggest modules our method uncovers form large shared substructures spanning hundreds of nucleotides and base pairs between the ribosomes of Thermus thermophilus, Escherichia coli, and Pseudomonas aeruginosa.
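To make the idea of a common module in edge-colored graphs concrete, here is a minimal sketch. It is not the algorithm from the paper: it is a simple greedy extension that grows a color-preserving node mapping from a seed pair of edges until no further node can be added by this rule. The graph encoding, the function names (`adjacency`, `consistent`, `extend_mapping`) and the edge labels (`B53`, `cWW`, `tHS`, `tSS`, used here only as illustrative interaction types) are all assumptions.

```python
from typing import Dict, Hashable, Tuple

Edge = Tuple[Hashable, Hashable]
Graph = Dict[Edge, str]  # undirected edge (u, v) -> colour (interaction type)


def adjacency(graph: Graph) -> Dict[Hashable, Dict[Hashable, str]]:
    """Build node -> {neighbour: colour} from an undirected edge dictionary."""
    adj: Dict[Hashable, Dict[Hashable, str]] = {}
    for (u, v), colour in graph.items():
        adj.setdefault(u, {})[v] = colour
        adj.setdefault(v, {})[u] = colour
    return adj


def consistent(adj_a, adj_b, mapping, na, nb) -> bool:
    """Check that mapping na -> nb agrees with every already-mapped node:
    an edge must be present with the same colour in both graphs, or absent in both."""
    return all(adj_a[na].get(m) == adj_b[nb].get(img) for m, img in mapping.items())


def extend_mapping(adj_a, adj_b, seed_a: Edge, seed_b: Edge) -> dict:
    """Greedily grow a colour-preserving node mapping from a seed edge pair.
    Returns {} when the two seed edges do not have the same colour."""
    (a1, a2), (b1, b2) = seed_a, seed_b
    if adj_a[a1][a2] != adj_b[b1][b2]:
        return {}
    mapping = {a1: b1, a2: b2}
    used_b = {b1, b2}
    frontier = [a1, a2]
    while frontier:
        a = frontier.pop()
        b = mapping[a]
        for na, colour in adj_a[a].items():
            if na in mapping:
                continue
            for nb, colour_b in adj_b[b].items():
                if nb in used_b or colour_b != colour:
                    continue
                if consistent(adj_a, adj_b, mapping, na, nb):
                    mapping[na] = nb
                    used_b.add(nb)
                    frontier.append(na)
                    break
    return mapping


# Two toy graphs (hypothetical data): they share a B53 edge followed by a cWW
# edge, then diverge (tHS in one graph, tSS in the other).
graph_a: Graph = {(1, 2): "B53", (2, 3): "cWW", (3, 4): "tHS"}
graph_b: Graph = {("x", "y"): "B53", ("y", "z"): "cWW", ("z", "w"): "tSS"}

adj_a, adj_b = adjacency(graph_a), adjacency(graph_b)
mapping = extend_mapping(adj_a, adj_b, (1, 2), ("x", "y"))
print(mapping)
```

The greedy rule commits to the first compatible neighbour it finds, so it yields one mapping that it cannot extend further, not necessarily the largest one. Enumerating all maximal mappings over every seed pair, and doing so efficiently on whole ribosomes, is the harder problem the abstract addresses.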