Understanding how gene expression is regulated on a global scale is important for determining how basic processes such as cell proliferation, cell differentiation and responses to environmental signals are controlled. Three papers now show that it is possible to identify binding sites for key transcription factors in human cells on a chromosome level.

Cellular and developmental processes are controlled in large part by transcription factors whose binding sites are small (average size 6-9 bp) and highly degenerate. How, then, does one find the regulatory elements in a sea of base pairs numbering in the billions?

The chromatin immunoprecipitation and microarray (ChIP-chip) method involves immunoprecipitating chromatin associated with a transcription factor of interest and then probing a genomic DNA array to identify sites bound by the factor (Fig. 1). This strategy was originally established in yeast using genomic DNA arrays containing all 6,000 intergenic regions spotted on a single microscope slide 1,2. Several groups recently adapted this technology to identify transcription factor binding sites in human cells for selected regions of the human genome 3-5. Two teams, one involving a collaboration between Affymetrix and Harvard 6 and the other at Yale 7,8, have now extended this technology to map binding sites along entire human chromosomes.

Martone et al. 7 and Euskirchen et al. 8 mapped the binding sites of NF-κB (p65) and CREB, respectively, along human chromosome 22, and Cawley et al. 6 mapped the binding sites of Sp1, c-Myc and p53 on chromosomes 21 and 22. In each case, genomic tiling arrays were used. The Affymetrix-Harvard group used oligonucleotide arrays containing 25-bp oligonucleotide pairs, with an average spacing of 35 bp.
The Yale group used a PCR-based tiling array containing 21,000 products with an average size of 820 bp.

In the cases of NF-κB, CREB, c-Myc and Sp1, the researchers found an extraordinary number of binding sites along each chromosome studied. The numbers extrapolate to a total of between 12,000 and 25,000 binding sites across the nonrepetitive regions of the entire genome. A considerable number of such sites (1,600) are probably also present for p53. Binding was noted near well-annotated genes as well as near new transcribed regions whose function is unknown. For annotated genes, the results provide new potential insights into the functions of these transcription factors. For example, many putative CREB targets are involved in neuronal function or signal transduction, with the potential to up- or downregulate the CREB signaling pathway. In addition, several potential targets of CREB and NF-κB are themselves transcription factors, suggesting the existence of regulatory cascades.
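The array figures and the genome-wide extrapolation mentioned above rest on simple arithmetic, which the following minimal Python sketch makes explicit. The function names are hypothetical, the 35-bp spacing is assumed here to be measured from probe start to probe start, and the extrapolation assumes that binding-site density scales linearly with the amount of sequence surveyed. None of this is taken from the methods of the papers themselves.

```python
def tiled_bases(n_products: int, mean_size_bp: float) -> float:
    """Total sequence represented by an array of PCR products."""
    return n_products * mean_size_bp


def probe_coverage_fraction(probe_len_bp: float, spacing_bp: float) -> float:
    """Fraction of sequence directly covered by probes.

    Assumes spacing is measured start to start; capped at 1.0 for
    overlapping probes.
    """
    return min(1.0, probe_len_bp / spacing_bp)


def extrapolate_sites(sites_observed: float, surveyed_bp: float,
                      target_bp: float) -> float:
    """Scale an observed site count linearly to a larger sequence span."""
    return sites_observed * target_bp / surveyed_bp


if __name__ == "__main__":
    # Figures quoted in the text: 21,000 PCR products averaging 820 bp.
    print(f"PCR array: {tiled_bases(21_000, 820) / 1e6:.2f} Mb represented")
    # Figures quoted in the text: 25-bp probes at an average spacing of 35 bp.
    print(f"Oligo array: {probe_coverage_fraction(25, 35):.0%} of bases probed")
```

A call such as `extrapolate_sites(n, chromosome_bp, genome_bp)` needs the per-chromosome site counts and the nonrepetitive sequence lengths from the original papers; they are not given in this text, so no genome-wide estimate is computed here.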
Wide distribution

One notable observation is that binding sites lie not only in the immediate vicinity of the transcription start site, but also at distal sites and within genes. These studies show that 9-27% of binding sites lie within 1 kb of the transcription start sites of genes. The results for ...