Cis-regulatory modules that control developmental gene expression process the regulatory inputs provided by the transcription factors for which they contain specific target sites. A prominent class of cis-regulatory processing functions can be modeled as logic operations. Many of these are combinatorial because they are mediated by multiple sites, although others are unitary. In this work, we illustrate the repertoire of cis-regulatory logic operations, as an approach toward a functional interpretation of the genomic regulatory code.
The differential control of gene expression during development depends primarily on transcription factor interactions with cis-regulatory modules (CRMs) that may be located upstream, downstream, or in the introns of a gene. The potential regulatory functions encoded in the DNA sequence of these modular control units are specified by the combinations of transcription factor target sites they contain, and, typically, a CRM will include sites for four to eight different interactions (1, 2). In the last analysis, it may be said that we will understand the genomic regulatory code only when we can interpret its functional significance by inspection, just as it has been possible for decades to recognize protein-coding sequence. However, at present, we cannot even recognize many cis-regulatory target sites; nor, perhaps more importantly, can we specify predictively, or, in some cases, even properly name, the elemental functions mediated by the individual sites within a CRM. Here, we take a step toward analysis of the repertoire of elemental cis-regulatory functions.
For a developmentally expressed gene, regulatory control always depends in part on transcription factors presented variably in embryonic time and space. In the following, we use the term "driver" for such factors.
These drivers provide spatial and temporal inputs (positive and negative) that are reflected in the regulatory output of the relevant CRM and in the resulting pattern of gene expression. However, target sites for driver inputs (3) may account for only a minority of the specific target sites in a CRM. Furthermore, the regulatory output of a CRM never exactly equals any of its inputs, a difference that can be perceived explicitly when the inputs and outputs are linked together in a gene regulatory network (2, 4-6). Instead, the CRM processes the driver inputs in a variety of complex ways, depending on its genomic design and, to some extent, on its genomic environs. We find that there is a class of fundamental processing functions, mediated by specific CRM target sites and combinations of sites, that have the behavior of logic operations, and it is on these sites that we focus herein.
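As a purely illustrative sketch, and not a model of any CRM described in this work, the idea that a module's output is a logic function of its driver inputs can be written out as a truth table. The three inputs below (two activating drivers and one repressing driver) and the AND/NOT combination are assumptions chosen only for illustration; the names `crm_output` and `truth_table` are hypothetical.

```python
from itertools import product


def crm_output(activator_a: bool, activator_b: bool, repressor: bool) -> bool:
    """Hypothetical CRM logic: both activators present AND repressor absent.

    The choice of inputs and of the AND/NOT combination is an assumption
    made for illustration; it is not taken from any measured module.
    """
    return (activator_a and activator_b) and not repressor


def truth_table():
    """Enumerate every combination of the three inputs with the CRM output."""
    return [
        (a, b, r, crm_output(a, b, r))
        for a, b, r in product([False, True], repeat=3)
    ]


if __name__ == "__main__":
    print("A      B      R      output")
    for a, b, r, out in truth_table():
        print(f"{a!s:<6} {b!s:<6} {r!s:<6} {out}")
```

In this toy case the output column differs from each of the three input columns in at least one row, which is one simple way to see that the output of a module need not equal any single input.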
Initial Insights from endo16
Endo16 is a developmentally regulated gene of the sea urchin embryo that is expressed in endodermal territory. With respect to its genomic regulatory code, endo16 may be the best understood of any developmentally active gene. The functional significance of every detectable target site in the two key CRMs of this gene was determined by mutation...