We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor–binding sites, and maps of chromatin organization. From these data, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor–binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.
The increasing volume of ChIP-chip and ChIP-seq data being generated creates a challenge for standard, integrative and reproducible bioinformatics data analysis platforms. We developed a web-based application called Cistrome, based on the Galaxy open source framework. In addition to the standard Galaxy functions, Cistrome has 29 ChIP-chip- and ChIP-seq-specific tools in three major categories, from preliminary peak calling and correlation analyses to downstream genome feature association, gene expression analyses, and motif discovery. Cistrome is available at http://cistrome.org/ap/.
Chromatin plays a central role in eukaryotic gene regulation. We have performed genome-wide mapping of epigenetically marked nucleosomes to determine their position both near transcription start sites and at distal regulatory elements, including enhancers. In prostate cancer cells, where androgen receptor (AR) binds primarily to enhancers, we found that androgen treatment dismisses a central nucleosome present over AR binding sites that is flanked by a pair of marked nucleosomes. A novel quantitative model built on the behavior of such nucleosome pairs correctly identified regions bound by the regulators of the immediate androgen response, including AR and FoxA1. More importantly, this model also correctly predicted novel binding sites for other transcription factors present following prolonged androgen stimulation, including Oct1 and NKX3.1. Thus, quantitative modeling of enhancer structure provides a powerful predictive method to infer the identity of transcription factors involved in cellular responses to specific stimuli.