We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.
Certain G-rich DNA sequences readily form four-stranded structures called G-quadruplexes. These sequence motifs are located in telomeres as a repeated unit, and elsewhere in the genome, where their function is currently unknown. It has been proposed that G-quadruplexes may be directly involved in gene regulation at the level of transcription. In support of this hypothesis, we show that the promoter regions (1 kb upstream of the transcription start site, TSS) of genes are significantly enriched in quadruplex motifs relative to the rest of the genome, with >40% of human gene promoters containing at least one quadruplex motif. Furthermore, these promoter quadruplexes strongly associate with nuclease hypersensitive sites identified throughout the genome via biochemical measurement. Regions of the human genome that are both nuclease hypersensitive and within promoters show a remarkable (230-fold) enrichment of quadruplex elements compared to the rest of the genome. The quadruplex motifs identified in promoter regions also show a structural bias towards more stable forms. These observations support the proposal that promoter G-quadruplexes are directly involved in the regulation of gene expression.
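The enrichment reported above is a ratio of motif densities (motifs per base) between a region set and a background set. The following is a minimal sketch of that calculation, not the authors' pipeline. The motif rule (four runs of at least three G separated by loops of 1 to 7 nt, searched on both strands), the function names, and the toy sequences are all assumptions introduced here for illustration.

```python
import re

# Assumed motif rule (not stated in the abstract): four G-runs of length >= 3
# separated by loops of 1-7 nt. The C-rich pattern captures motifs on the
# opposite strand.
LOOP = "[ACGT]{1,7}"
G4 = re.compile(f"G{{3,}}{LOOP}G{{3,}}{LOOP}G{{3,}}{LOOP}G{{3,}}")
C4 = re.compile(f"C{{3,}}{LOOP}C{{3,}}{LOOP}C{{3,}}{LOOP}C{{3,}}")


def count_motifs(seq):
    """Count non-overlapping candidate motifs on both strands of one sequence."""
    seq = seq.upper()
    return len(G4.findall(seq)) + len(C4.findall(seq))


def motif_density(seqs):
    """Motifs per base across a collection of sequences."""
    total_len = sum(len(s) for s in seqs)
    if total_len == 0:
        raise ValueError("no sequence supplied")
    return sum(count_motifs(s) for s in seqs) / total_len


def fold_enrichment(region_seqs, background_seqs):
    """Ratio of motif density in the regions to density in the background."""
    background = motif_density(background_seqs)
    if background == 0:
        raise ValueError("background contains no motifs; ratio undefined")
    return motif_density(region_seqs) / background


# Toy sequences (hypothetical, for illustration only).
promoters = ["AAGGGAGGGTGGGAGGGAA"]                      # 1 motif in 19 nt
background = ["A" * 19 + "CCCTCCCACCCTCCC" + "A" * 4]    # 1 motif in 38 nt
print(fold_enrichment(promoters, background))
```

On real data the region and background sets would be genome intervals rather than short strings, and a significance test would be needed in addition to the ratio.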
Guanine-rich nucleic acid sequences can adopt noncanonical four-stranded secondary structures called guanine (G)-quadruplexes1. Bioinformatics analysis suggests that G-quadruplex motifs are prevalent in genomes2, which raises the need to elucidate their function. There is now evidence for the existence of DNA G-quadruplexes at telomeres with associated biological function3. A recent hypothesis supports the notion that gene promoter elements contain DNA G-quadruplex motifs that control gene expression at the transcriptional level4. We discovered a highly conserved, thermodynamically stable RNA G-quadruplex in the 5′ untranslated region (UTR) of the gene transcript of the human NRAS proto-oncogene. Using a cell-free translation system coupled to a reporter gene assay, we have demonstrated that this NRAS RNA G-quadruplex modulates translation. This is the first example of translational repression by an RNA G-quadruplex. Bioinformatics analysis has revealed 2,922 other 5′ UTR RNA G-quadruplex elements in the human genome. We propose that RNA G-quadruplexes in the 5′ UTR modulate gene expression at the translational level.
The existence of RNA G-quadruplexes in vivo is more inevitable than the existence of DNA G-quadruplexes, given that (i) the former are generally more thermodynamically stable in the folded form than their DNA counterparts5, and (ii) RNA is single-stranded, which implies that quadruplex formation does not have to compete with hybridization to a complementary strand. In this study we have focused on the 5′ UTRs of mRNA, which are known to be involved in translational regulation, particularly for growth factors, transcription factors and oncoproteins6. The neuroblastoma RAS viral oncogene homolog (NRAS)-encoded protein p21 mediates both signal transduction across the plasma membrane and the intracellular signaling pathways responsible for cell proliferation and differentiation7.
Activating mutations in the coding region of NRAS are responsible for increased cell proliferation8. The suppression of oncogenic NRAS by small interfering RNA causes apoptosis of tumor cells9, which suggests that inhibiting the expression of oncogenic NRAS is a potential therapeutic strategy. Using a computational search algorithm we developed for locating quadruplex sequence motifs2, we identified a putative G-quadruplex (Fig. 1 online). This motif is highly conserved, in both its sequence and its position relative to the translation start site, across the 5′ UTRs of human, chimpanzee, macaque, mouse, rat and dog genes orthologous to NRAS (Table 1 and Supplementary Table 1 online).
To confirm that the putative RNA G-quadruplex NRQ folds into a stable quadruplex, we carried out biophysical experiments on the synthetic oligonucleotide 5′-UGUGGGAGGGGCGGGUCUGGG-3′. Circular dichroism (CD) spectroscopy has been widely used to characterize the structure of folded nucleic acid quadruplexes11. At pH 7.4, 100 mM KCl, the CD spectrum of NRQ showed a positive peak at 263 nm and a negative peak at 241 nm (Fig. 1a), which is the ch...
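The excerpt mentions a computational search for quadruplex sequence motifs but does not give its rule. As a hedged illustration only, the sketch below applies one commonly used pattern (four G-runs of at least three G, loops of 1 to 7 nt) to the synthetic oligonucleotide quoted above and splits the hit into G-tracts and loops. The pattern, the lazy loop matching, and the name `find_g4` are assumptions made here, not the authors' algorithm.

```python
import re

# Assumed rule: G{3,} tracts separated by loops of 1-7 nt. Loops are matched
# lazily so the shortest loops (and longest G-tracts) are preferred.
G4_RNA = re.compile(
    r"(G{3,})([ACGU]{1,7}?)(G{3,})([ACGU]{1,7}?)(G{3,})([ACGU]{1,7}?)(G{3,})"
)


def find_g4(seq):
    """Return candidate quadruplex motifs with their G-tracts and loops."""
    seq = seq.upper().replace("T", "U")  # accept DNA or RNA spelling
    hits = []
    for match in G4_RNA.finditer(seq):
        groups = match.groups()
        hits.append(
            {
                "start": match.start(),   # 0-based, inclusive
                "end": match.end(),       # 0-based, exclusive
                "tracts": groups[0::2],   # the four G-runs
                "loops": groups[1::2],    # the three intervening loops
            }
        )
    return hits


# Oligonucleotide sequence quoted in the text above.
NRQ = "UGUGGGAGGGGCGGGUCUGGG"
for hit in find_g4(NRQ):
    print(hit)
```

Under this assumed pattern the oligonucleotide yields a single hit with loops `A`, `C` and `UCU`. Other parses of the same sequence are possible (for example a loop of `AG` with a shorter second tract), so the decomposition should be read as one candidate register rather than an experimentally determined one.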
Guanine-rich DNA sequences can form a large number of structurally diverse quadruplexes. These vary in terms of strand polarity, loop composition, and conformation. We have derived guidelines for understanding the influence of loop length on the structure adopted by intramolecular quadruplex-forming sequences, combining experimental data (CD and UV melting) with molecular modeling and simulation techniques. We find that a parallel-stranded intramolecular quadruplex structure is the only possible fold when three single-residue loops are present. When single thymine loops are present in combination with longer loops, or when all loops are longer than two residues, both parallel- and antiparallel-folded structures are able to form. Multiple conformations of each structure are likely to coexist in solution, as they were calculated to have very similar free energies.
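The loop-length guidelines stated above can be written as a small lookup, sketched here under simplifying assumptions. The function name `predicted_folds` is hypothetical, the rule considers loop lengths only (the abstract's second case refers specifically to single thymine loops, which this sketch does not check), and combinations the abstract does not address return `None` rather than a guess.

```python
def predicted_folds(loop_lengths):
    """Map three loop lengths to the folds the stated guidelines allow.

    Returns a set of fold names, or None when the combination is not
    covered by the guidelines as summarized in the abstract.
    """
    if len(loop_lengths) != 3 or any(n < 1 for n in loop_lengths):
        raise ValueError("expected three loop lengths, each >= 1")

    # Three single-residue loops: only the parallel fold is possible.
    if all(n == 1 for n in loop_lengths):
        return {"parallel"}

    # A single-residue loop combined with longer loops, or all loops longer
    # than two residues: both folds are able to form.
    has_single = any(n == 1 for n in loop_lengths)
    if has_single or all(n > 2 for n in loop_lengths):
        return {"parallel", "antiparallel"}

    # Anything else (e.g. loops of length 2 with no single-residue loop)
    # is not addressed by the abstract.
    return None


print(predicted_folds((1, 1, 1)))
print(predicted_folds((1, 3, 1)))
print(predicted_folds((2, 2, 3)))
```

Because the abstract notes that multiple conformations probably coexist in solution, the returned set describes which folds are allowed, not which one dominates.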