This is a PDF file of a peer-reviewed paper that has been accepted for publication. Although unedited, the content has been subjected to preliminary formatting. Nature is providing this early version of the typeset paper as a service to our authors and readers. The text and figures will undergo copyediting and a proof review before the paper is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers apply.
The arrangement of transcription factor (TF) binding motifs (syntax) is an important part of the cis-regulatory code, yet remains elusive. We introduce a deep learning model, BPNet, that uses DNA sequence to predict base-resolution ChIP-nexus binding profiles of pluripotency TFs. We develop interpretation tools to learn predictive motif representations and identify soft syntax rules for cooperative TF binding interactions. Strikingly, Nanog preferentially binds with helical periodicity, and TFs often cooperate in a directional manner, which we validate using CRISPR-induced point mutations. Our model represents a powerful general approach to uncover the motifs and syntax of cis-regulatory sequences in genomics data.
The rapid encoding of contextual memory requires the CA3 region of the hippocampus, but the necessary genetic pathways remain unclear. We found that the activity-dependent transcription factor Npas4 regulates a transcriptional program in CA3 that is required for contextual memory formation. Npas4 was specifically expressed in CA3 after contextual learning. Both global knockout and selective deletion of Npas4 in CA3 impaired contextual memory, and restoration of Npas4 in CA3 was sufficient to reverse the deficit in global knockout mice. By recruiting RNA polymerase II to promoters and enhancers of target genes, Npas4 regulates a learning-specific transcriptional program in CA3 that includes many well-known activity-regulated genes, suggesting that Npas4 is a master regulator of activity-regulated gene programs and is central to memory formation.