To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation.
The origin recognition complex (ORC) is an essential DNA replication initiation factor conserved in all eukaryotes. In Saccharomyces cerevisiae, ORC binds to specific DNA elements; however, in higher eukaryotes, ORC exhibits little sequence specificity in vitro or in vivo. We investigated the genome-wide distribution of ORC in Drosophila and found that ORC localizes to specific chromosomal locations in the absence of any discernible simple motif. Although no clear sequence motif emerged, we were able to use machine learning approaches to accurately discriminate between ORC-associated sequences and ORC-free sequences based solely on primary sequence. The complex sequence features that define ORC binding sites are highly correlated with nucleosome positioning signals and likely represent a preferred nucleosomal landscape for ORC association. Open chromatin appears to be the underlying feature that is deterministic for ORC binding. ORC-associated sequences are enriched for the histone variant, H3.3, often at transcription start sites, and depleted for bulk nucleosomes. The density of ORC binding along the chromosome is reflected in the time at which a sequence replicates, with early replicating sequences having a high density of ORC binding. Finally, we found a high concordance between sites of ORC binding and cohesin loading, suggesting that, in addition to DNA replication, ORC may be required for the loading of cohesin on DNA in Drosophila.
Mutational heterogeneity must be taken into account when reconstructing evolutionary histories, calibrating molecular clocks, and predicting links between genes and disease. Selective pressures and various DNA transactions have been invoked to explain the heterogeneous distribution of genetic variation between species, within populations, and in tissue-specific tumors. To examine relationships between such heterogeneity and variations in leading- and lagging-strand replication fidelity and mismatch repair, we accumulated 40,000 spontaneous mutations in eight diploid yeast strains in the absence of selective pressure. We found that replicase error rates vary by fork direction, coding state, nucleosome proximity, and sequence context. Further, error rates and DNA mismatch repair efficiency both vary by mismatch type, responsible polymerase, replication time, and replication origin proximity. Mutation patterns implicate replication infidelity as one driver of variation in somatic and germline evolution, suggest mechanisms of mutual modulation of genome stability and composition, and predict future observations in specific cancers.
DNA replication initiates from thousands of start sites throughout the Drosophila genome and must be coordinated with other ongoing nuclear processes such as transcription to ensure genetic and epigenetic inheritance. Considerable progress has been made toward understanding how chromatin modifications regulate the transcription program; in contrast, we know relatively little about the role of the chromatin landscape in defining how start sites of DNA replication are selected and regulated. Here, we describe the Drosophila replication program in the context of the chromatin and transcription landscape for multiple cell lines using data generated by the modENCODE consortium. We find that while the cell lines exhibit similar replication programs, there are numerous cell line-specific differences that correlate with changes in the chromatin architecture. We identify chromatin features that are associated with replication timing, early origin usage, and ORC binding. Primary sequence, activating chromatin marks, and DNA-binding proteins (including chromatin remodelers) contribute in an additive manner to specify ORC-binding sites. We also generate accurate and predictive models from the chromatin data to describe origin usage and strength between cell lines. Multiple activating chromatin modifications contribute to the function and relative strength of replication origins, suggesting that the chromatin environment does not regulate origins of replication as a simple binary switch, but rather acts as a tunable rheostat to regulate replication initiation events.