A complex landscape of genomic regulatory elements underpins patterns of metazoan gene expression, yet it has been technically difficult to disentangle composite regulatory elements within their endogenous genomic context. Expression of the Sox2 transcription factor (TF) in mouse embryonic stem cells (mESCs) depends on a distal regulatory cluster of DNase I hypersensitive sites (DHSs), but the contributions of individual DHSs and their degree of independence remain unclear. Here, we comprehensively analyze the regulatory architecture of the Sox2 locus in mESCs using Big-IN to scarlessly deliver payloads ranging up to 143 kb, permitting deletions, rearrangements and inversions of single or multiple DHSs, and surgical alterations to individual TF recognition sequences. Multiple independent mESC clones were derived for each payload, extensively sequence-verified, and profiled for expression of Sox2 specifically from the engineered allele. We find that a single core DHS comprising a handful of key TF recognition sequences is sufficient to sustain significant expression in mESCs, though its contribution is modulated by additional DHSs. Moreover, the overall activity of these DHSs is influenced by specific DHS order and/or orientation effects. We built a highly predictive model for locus regulation which includes nonlinear components indicating both synergy and redundancy among DHSs. Our results suggest that composite regulatory elements and their influence on gene expression can be resolved to a tractable set of sequence features using synthetic regulatory genomics.
Enhancer function is frequently investigated piecemeal using truncated reporter assays or single deletion analysis, so it remains unclear to what extent enhancer function is influenced by surrounding genomic context. Using our Big-IN technology for targeted integration of large DNAs, we analyzed the regulatory architecture of the Igf2/H19 locus, a paradigmatic model of enhancer selectivity. We assembled payloads containing a 157-kb functional Igf2/H19 locus, engineering mutations to genetically direct CTCF occupancy at the imprinting control region (ICR) that switches the target gene of the H19 enhancer cluster. Contrasting their activity when delivered to the endogenous locus or to a safe harbor locus (Hprt) revealed that the functional elements comprising the Igf2/H19 locus are highly sensitive to their native context. Exchanging components of the Igf2/H19 locus with the well-studied Sox2 locus showed that the H19 enhancer cluster in particular functions poorly out of context and requires its native surroundings to activate Sox2. Conversely, the Sox2 locus control region (LCR) could activate Igf2 and H19, but its activity was only partially modulated by CTCF occupancy at the ICR. Analysis of regulatory DNA actuation across cell types showed that, while H19 is tightly coordinated within its native locus, the Sox2 LCR is more independent. We show that these enhancer clusters typify broader classes of loci genome-wide. Our synthetic regulatory genomics approach shows that unexpected dependencies may influence even the most studied functional elements and permits large-scale manipulation of complete loci to investigate how locus architecture relates to function.