Gene expression in human tissue has primarily been studied at the transcriptional level, largely neglecting translational regulation. Here, we analyze the translatomes of 80 human hearts to identify new translation events and quantify the effect of translational regulation. We show extensive translational control of cardiac gene expression, which is orchestrated in a process-specific manner. Translation downstream of predicted disease-causing protein-truncating variants appears to be frequent, suggesting inefficient translation termination. We identify hundreds of previously undetected microproteins, expressed from lncRNAs and circRNAs, for which we validate the protein products in vivo. The translation of microproteins is not restricted to the heart and is prominent in the translatomes of human kidney and liver. We associate these microproteins with diverse cellular processes and compartments and find that many localize to the mitochondria. Importantly, dozens of microproteins are translated from lncRNAs with well-characterized noncoding functions, indicating previously unrecognized biology.
Upstream open reading frames (uORFs) are tissue-specific cis-regulators of protein translation. Isolated reports have shown that variants that create or disrupt uORFs can cause disease. Here, in a systematic genome-wide study using 15,708 whole genome sequences, we show that variants that create new upstream start codons, and variants disrupting stop sites of existing uORFs, are under strong negative selection. This selection signal is significantly stronger for variants arising upstream of genes intolerant to loss-of-function variants. Furthermore, variants creating uORFs that overlap the coding sequence show signals of selection equivalent to coding missense variants. Finally, we identify specific genes where modification of uORFs likely represents an important disease mechanism, and report a novel uORF frameshift variant upstream of NF2 in neurofibromatosis. Our results highlight uORF-perturbing variants as an under-recognised functional class that contributes to penetrant human disease, and demonstrate the power of large-scale population sequencing data in studying non-coding variant classes.