Data from the Encyclopedia of DNA Elements (ENCODE) project show over 9640 human genome loci classified as long noncoding RNAs (lncRNAs), yet only ~100 have been deeply characterized to determine their role in the cell. To measure the protein-coding output from these RNAs, we jointly analyzed two recent data sets produced in the ENCODE project: tandem mass spectrometry (MS/MS) data mapping expressed peptides to their encoding genomic loci, and RNA-seq data generated by ENCODE in long polyA+ and polyA− fractions in the cell lines K562 and GM12878. We used the machine-learning algorithm RuleFit3 to regress the peptide data against RNA expression data. The most important covariate for predicting translation was, surprisingly, the cytosol polyA− fraction in both cell lines. LncRNAs are ~13-fold less likely to produce detectable peptides than similar mRNAs, indicating that ~92% of GENCODE v7 lncRNAs are not translated in these two ENCODE cell lines. Intersecting 9640 lncRNA loci with 79,333 peptides yielded 85 unique peptides matching 69 lncRNAs. Most cases were due to a coding transcript misannotated as lncRNA. Two exceptions were an unprocessed pseudogene and a bona fide lncRNA gene, both with open reading frames (ORFs) compromised by upstream stop codons. All potentially translatable lncRNA ORFs had only a single peptide match, indicating low protein abundance and/or false-positive peptide matches. We conclude that with very few exceptions, ribosomes are able to distinguish coding from noncoding transcripts and, hence, that ectopic translation and cryptic mRNAs are rare in the human lncRNAome.
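The intersection step described in this abstract (peptides mapped to genomic coordinates, overlapped with annotated lncRNA loci) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name, the tuple layouts, the half-open coordinate convention, the same-strand requirement, and all toy coordinates are hypothetical.

```python
from collections import defaultdict


def intersect_peptides_with_loci(peptides, loci):
    """Map each locus ID to the set of peptide sequences overlapping it.

    Both inputs use half-open [start, end) coordinates (an assumption).
    peptides: iterable of (sequence, chrom, strand, start, end)
    loci:     iterable of (locus_id, chrom, strand, start, end)
    """
    # Group loci by chromosome and strand so each peptide is only
    # compared against loci it could possibly overlap.
    loci_by_key = defaultdict(list)
    for locus_id, chrom, strand, start, end in loci:
        loci_by_key[(chrom, strand)].append((start, end, locus_id))

    hits = defaultdict(set)
    for seq, chrom, strand, start, end in peptides:
        for l_start, l_end, locus_id in loci_by_key.get((chrom, strand), []):
            # Two half-open intervals overlap when each starts before
            # the other ends.
            if start < l_end and l_start < end:
                hits[locus_id].add(seq)
    return dict(hits)


# Toy data (hypothetical coordinates, not taken from ENCODE).
loci = [
    ("lnc1", "chr1", "+", 100, 500),
    ("lnc2", "chr1", "-", 1000, 2000),
    ("lnc3", "chr2", "+", 50, 150),
]
peptides = [
    ("PEPTIDEA", "chr1", "+", 120, 150),    # inside lnc1
    ("PEPTIDEB", "chr1", "+", 480, 510),    # straddles the end of lnc1
    ("PEPTIDEA", "chr1", "+", 120, 150),    # duplicate spectrum match
    ("PEPTIDEC", "chr1", "+", 1100, 1130),  # wrong strand for lnc2
    ("PEPTIDED", "chr2", "+", 150, 180),    # adjacent to lnc3, no overlap
]

hits = intersect_peptides_with_loci(peptides, loci)
unique_peptides = set().union(*hits.values()) if hits else set()
print(len(unique_peptides), "unique peptides matching", len(hits), "loci")
```

Collecting sequences in sets means repeated matches of the same peptide are counted once, which is what distinguishes "unique peptides" from raw matches in a summary such as "85 unique peptides matching 69 lncRNAs". The nested loop is adequate for a toy example; a real analysis at this scale would more plausibly use an interval tree or a sorted sweep.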
Small-molecule splicing modifiers described previously target the general splicing machinery and thus have low specificity for individual genes. Several potent molecules that correct the splicing deficit of the SMN2 (survival of motor neuron 2) gene have been identified, and these molecules are moving towards a potential therapy for spinal muscular atrophy (SMA). Here, using a combination of RNA splicing, transcription, and protein chemistry techniques, we show that these molecules directly bind to two distinct sites of the SMN2 pre-mRNA, thereby stabilizing an as yet unidentified ribonucleoprotein (RNP) complex that is critical to the specificity of these small molecules for SMN2 over other genes. In addition to the therapeutic potential of these molecules for treatment of SMA, our work has wide-ranging implications in understanding how small molecules can interact with specific quaternary RNA structures.
In the last few years, machine learning (ML) and artificial intelligence have seen a new wave of publicity fueled by the huge and ever-increasing amount of data and computational power as well as the discovery of improved learning algorithms. However, the idea of a computer learning some abstract concept from data and applying it to as yet unseen situations is not new and has been around at least since the 1950s. Many of these basic principles are very familiar to the pharmacometrics and clinical pharmacology community. In this paper, we introduce the foundational ideas of ML to this community so that readers obtain the essential tools they need to understand publications on the topic. Although we will not go into full detail or theoretical background, we aim to point readers to relevant literature and put applications of ML in molecular biology as well as the fields of pharmacometrics and clinical pharmacology into perspective.
Tandem mass tag (TMT) is a multiplexing technology widely used in proteomic research. It enables relative quantification of proteins from multiple biological samples in a single mass spectrometry run with high efficiency and high throughput. However, experiments often require more biological replicates or conditions than can be accommodated by a single run, and therefore involve multiple TMT mixtures and multiple runs. Such larger-scale experiments combine sources of biological and technical variation in patterns that are complex, unique to TMT-based workflows, and challenging for the downstream statistical analysis. These patterns cannot be adequately characterized by statistical methods designed for other technologies, such as label-free proteomics or transcriptomics. This manuscript proposes a general statistical approach for relative protein quantification in mass spectrometry-based experiments with TMT labeling. It is applicable to experiments with multiple conditions, multiple biological replicate runs, multiple technical replicate runs, and unbalanced designs. It is based on a flexible family of linear mixed-effects models that handle complex patterns of technical artifacts and missing values. The approach is implemented in MSstatsTMT, a freely available open-source R/Bioconductor package compatible with data processing tools such as Proteome Discoverer, MaxQuant, OpenMS and SpectroMine. Evaluation on a controlled mixture, simulated datasets, and three biological investigations with diverse designs demonstrated that MSstatsTMT balanced the sensitivity and the specificity of detecting differentially abundant proteins, in particular in large-scale experiments with multiple biological mixtures.
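To make "a flexible family of linear mixed-effects models" more concrete, one generic member of such a family for a single protein might be written as below. This is a schematic sketch under stated assumptions, not necessarily the parameterization used in MSstatsTMT: the index names (mixture m, technical replicate run t, condition c, biological replicate b) and the choice of which terms are random are illustrative.

```latex
% Y_{mtcb}: log-scale protein abundance in mixture m, technical replicate
% run t, condition c, biological replicate b (illustrative notation).
Y_{mtcb} = \mu
         + \mathrm{Mixture}_{m}
         + \mathrm{TechRep}_{t(m)}
         + \mathrm{Condition}_{c}
         + \mathrm{Subject}_{b(mc)}
         + \varepsilon_{mtcb},
\qquad
\begin{aligned}
\mathrm{Mixture}_{m}    &\sim \mathcal{N}(0, \sigma^2_{M}), &
\mathrm{TechRep}_{t(m)} &\sim \mathcal{N}(0, \sigma^2_{T}), \\
\mathrm{Subject}_{b(mc)} &\sim \mathcal{N}(0, \sigma^2_{S}), &
\varepsilon_{mtcb}      &\sim \mathcal{N}(0, \sigma^2).
\end{aligned}
```

In a model of this shape, the condition term is the fixed effect of interest, and differential abundance between two conditions c and c' is tested through the contrast Condition_c − Condition_c'. Dropping random terms that a given design cannot estimate (for example, the technical replicate term when each mixture is run once) yields the simpler members of the family.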
The blood–retina barrier and blood–brain barrier (BRB/BBB) are selective and semipermeable and are critical for supporting and protecting central nervous system (CNS)-resident cells. Endothelial cells (ECs) within the BRB/BBB are tightly coupled, express high levels of Claudin-5 (CLDN5), a junctional protein that stabilizes ECs, and are important for proper neuronal function. To identify novel CLDN5 regulators (and ultimately EC stabilizers), we generated a CLDN5-P2A-GFP stable cell line from human pluripotent stem cells (hPSCs), directed their differentiation to ECs (CLDN5-GFP hPSC-ECs), and performed flow cytometry-based chemogenomic library screening to measure GFP expression as a surrogate reporter of barrier integrity. Using this approach, we identified 62 unique compounds that activated CLDN5-GFP. Among them were TGF-β pathway inhibitors, including RepSox. When applied to hPSC-ECs, primary brain ECs, and retinal ECs, RepSox strongly elevated barrier resistance (transendothelial electrical resistance), reduced paracellular permeability (fluorescein isothiocyanate-dextran), and prevented vascular endothelial growth factor A (VEGFA)-induced barrier breakdown in vitro. RepSox also altered vascular patterning in the mouse retina during development when delivered exogenously. To determine the mechanism of action of RepSox, we performed kinome-, transcriptome-, and proteome-profiling and discovered that RepSox inhibited TGF-β, VEGFA, and inflammatory gene networks. In addition, RepSox not only activated vascular-stabilizing and barrier-establishing Notch and Wnt pathways, but also induced expression of important tight junctions and transporters. Taken together, our data suggest that inhibiting multiple pathways by selected individual small molecules, such as RepSox, may be an effective strategy for the development of better BRB/BBB models and novel EC barrier-inducing therapeutics.