There are 481 segments longer than 200 bp that are absolutely conserved (100% identity with no insertions or deletions) between orthologous regions of the human, rat and mouse genomes. Nearly all of these segments are also conserved in the chicken and dog genomes, with an average of 95% and 99% identity, respectively. Many are also significantly conserved in fish. These ultraconserved elements of the human genome are most often located either overlapping exons in genes involved in RNA processing or in introns or nearby genes involved in regulation of transcription and development. Along with more than 5,000 sequences of over 100 bp that are absolutely conserved among the three sequenced mammals, these represent a class of genetic elements whose functions and evolutionary origins are yet to be determined, but which are more highly conserved between these species than proteins, and appear to be essential for the ontogeny of mammals and other vertebrates.
Although only about 1.2% of the human genome appears to code for protein (1-3), it has been estimated that as much as 5% is more conserved than expected from neutral evolution since the split with rodents, and hence may be under negative or "purifying" selection (4-6). Several studies have found specific non-coding segments in the human genome that appear to be under selection, using a threshold for conservation of 70% or 80% identity with mouse over more than 100 bp (7-13). A study of these elements on human chromosome 21 found that those that were very highly conserved in multiple species contained significant numbers of non-coding elements (13). Similar results were found comparing the human, mouse and rat (14, 15) in a study of the 1.8 Mb CFTR region (16, 17), and in a functional study of the SIM2 locus in a number of mammalian species (18).
We determined the longest segments of the human genome that are maximally conserved with orthologous segments in rodents: those showing 100% identity and with no insertions or deletions in their alignment with mouse and rat. Exclusive of ribosomal RNA regions, there are 481 such segments longer than 200 bp that we call ultraconserved elements (table S1). They are widely distributed in the genome (on all chromosomes except chromosomes 21 and Y), and are often found in clusters (Fig. 1). The probability is less than 10^-22 of finding even one such element in 2.9 billion bases under a simple model of neutral evolution with independent substitutions at each site, using the slowest neutral substitution rate that is observed for any 1 Mb region of the genome (supporting text, section S1). Nearly all of these elements also exhibited extremely high levels of conservation with orthologous regions in the chicken genome (467/481 = 97% of the elements aligning at an average of 95.7% identity, 29 at 100% identity), and about two-thirds of them with the fugu genome as well (324/481 = 67.3% of the elements aligning at an average of 76.8% identity), despite the fact that only about 4% of the human genome can be reliably aligned to the chicken ...
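The selection criterion described above (100% identity, no insertions or deletions, longer than 200 bp) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes the human, mouse and rat sequences are already available as equal-length rows of a multiple alignment with "-" marking gaps, and the function name `ultraconserved_runs` and the parameter `longer_than` are hypothetical names chosen for illustration.

```python
from typing import Iterator, List, Tuple

GAP = "-"


def ultraconserved_runs(
    aligned: List[str], longer_than: int = 200
) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) alignment-column ranges in which every
    row carries the same base, no row has a gap, and the run is strictly
    longer than `longer_than` columns.

    Assumption: `aligned` holds pre-aligned rows of equal length
    (e.g. human, mouse, rat). Columns containing a gap or an ambiguous
    base "N" in any row end the current run.
    """
    if not aligned:
        return
    width = len(aligned[0])
    if any(len(row) != width for row in aligned):
        raise ValueError("all aligned rows must have the same length")

    run_start = None  # first column of the run currently being extended
    for col in range(width):
        column = {row[col].upper() for row in aligned}
        identical = len(column) == 1 and GAP not in column and "N" not in column
        if identical:
            if run_start is None:
                run_start = col
        else:
            if run_start is not None and col - run_start > longer_than:
                yield (run_start, col)
            run_start = None
    # Close a run that extends to the last column of the alignment.
    if run_start is not None and width - run_start > longer_than:
        yield (run_start, width)


if __name__ == "__main__":
    # Toy rows with a tiny threshold so the behaviour is visible.
    human = "ACGTACGTAC"
    mouse = "ACGTTCGTAC"
    rat = "ACGTACG-AC"
    print(list(ultraconserved_runs([human, mouse, rat], longer_than=3)))
```

On the toy rows only the first four columns form a gap-free identical run longer than 3 columns, so the call reports the single range (0, 4). Real use would need the orthologous alignment and the exclusion of ribosomal RNA regions mentioned in the text, neither of which this sketch handles.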
The University of California, Santa Cruz (UCSC) Genome Browser website (http://genome.ucsc.edu/) provides a large database of publicly available sequence and annotation data along with an integrated tool set for examining and comparing the genomes of organisms, aligning sequence to genomes, and displaying and sharing users’ own annotation data. As of September 2009, genomic sequence and a basic set of annotation ‘tracks’ are provided for 47 organisms, including 14 mammals, 10 non-mammal vertebrates, 3 invertebrate deuterostomes, 13 insects, 6 worms and a yeast. New data highlights this year include an updated human genome browser, a 44-species multiple sequence alignment track, improved variation and phenotype tracks and 16 new genome-wide ENCODE tracks. New features include drag-and-zoom navigation, a Wiki track for user-added annotations, new custom track formats for large datasets (bigBed and bigWig), a new multiple alignment output tool, links to variation and protein structure tools, in silico PCR utility enhancements, and improved track configuration tools.