Deep learning (DL) methods accurately predict various functional properties from genomic DNA, including gene expression, and promise to serve as an important tool for interpreting the full spectrum of genetic variation in personal genomes. However, systematic out-of-sample benchmarking is needed to assess the gap in their utility as personalized DNA interpreters. Using paired whole-genome sequencing and gene expression data, we evaluate DL sequence-to-expression models and identify a critical failure to make correct predictions at a substantial number of genomic loci, which highlights the limits of the current model training paradigm.
“γc” cytokines are a family whose receptors share a “common-gamma-chain” signaling moiety, and play central roles in the differentiation, homeostasis, and communication of all immunocyte lineages. As a resource to better understand their range and specificity of action, we profiled by RNA-seq the immediate-early responses to the main γc cytokines across all immunocyte lineages. The results reveal an unprecedented landscape: broader, with extensive overlap between cytokines (one cytokine doing in one cell what another does elsewhere) and essentially no effects unique to any one cytokine. Responses include a major downregulation component and a broad Myc-controlled resetting of biosynthetic and metabolic pathways. Various mechanisms appear to be involved: fast transcriptional activation, chromatin remodeling, and mRNA destabilization. Other surprises were uncovered: IL2 effects in mast cells, shifts between follicular and marginal zone B cells, paradoxical and cell-specific cross-talk between interferon and γc signatures, and an NKT-like program induced by IL21 in CD8+ T cells.
Intron splicing is a key regulatory step in gene expression in eukaryotes. Three sequence elements required for splicing – 5' and 3' splice sites and a branch point – are especially well-characterized in Saccharomyces cerevisiae, but our understanding of additional intron features that impact splicing in this organism is incomplete, due largely to its small number of introns. To overcome this limitation, we constructed a library in S. cerevisiae of random 50-nucleotide elements (N50) individually inserted into the intron of a reporter gene and quantified canonical splicing and the use of cryptic splice sites by sequencing analysis. More than 70% of approximately 140,000 N50 elements reduced splicing by at least 20% compared to the intron control. N50 features, including higher GC content, the presence of GU repeats, and stronger predicted secondary structure of the resulting pre-mRNA, correlated with reduced splicing efficiency. A likely basis for the reduced splicing of such a large proportion of variants is the formation of RNA structures that pair N50 bases – such as the GU repeats – with other bases specifically within the reporter pre-mRNA analyzed. However, neither convolutional neural network models nor linear models were able to explain more than a small fraction of the variance in splicing efficiency across the library, suggesting that complex non-linear interactions in RNA structures are not accurately captured by RNA structure prediction methods given the limited number of variants. Our results imply that the specific context of a pre-mRNA may determine the bases allowable in an intron to prevent secondary structures that reduce splicing.
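As a rough illustration of two of the sequence features named above (GC content and GU repeats), the following minimal Python sketch computes them for a single element. It is not the authors' pipeline: the function names `gc_content` and `longest_gu_repeat` are hypothetical, and measuring GU repeats as the longest run of consecutive GU dinucleotides is an assumption made here for illustration only.

```python
import re


def gc_content(seq: str) -> float:
    """Fraction of G or C bases in a DNA/RNA sequence (0.0 for empty input)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def longest_gu_repeat(seq: str) -> int:
    """Longest run of consecutive 'GU' dinucleotides, counted in GU units.

    Assumption: DNA input is converted to RNA (T -> U) before scanning.
    """
    rna = seq.upper().replace("T", "U")
    runs = re.findall(r"(?:GU)+", rna)
    return max((len(run) // 2 for run in runs), default=0)


if __name__ == "__main__":
    # Hypothetical example element, not taken from the library in the study.
    element = "AAGUGUGUAAGUCC"
    print(gc_content(element), longest_gu_repeat(element))
```

Features like these could serve as inputs to a simple linear model, although the abstract reports that such models explained only a small fraction of the variance in splicing efficiency.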