2005
DOI: 10.1101/gr.3577405

Distribution and intensity of constraint in mammalian genomic sequence

Abstract: Comparisons of orthologous genomic DNA sequences can be used to characterize regions that have been subject to purifying selection and are enriched for functional elements. We here present the results of such an analysis on an alignment of sequences from 29 mammalian species. The alignment captures ∼3.9 neutral substitutions per site and spans ∼1.9 Mbp of the human genome. We identify constrained elements from 3 bp to over 1 kbp in length, covering ∼5.5% of the human locus. Our estimate for the total amount of…

Cited by 1,257 publications (1,258 citation statements)
References: 65 publications
“…The first SNV located on chromosome 4 (position 189,411,955) is heterozygous in the healthy twin. This variant is located in a genomic evolutionary rate profiling (GERP) constraint element of 135 bp with the closest gene >100 kb away 31 . Intriguingly, the second SNV turned out to be a mosaic variant with a higher frequency of the variant allele in the schizophrenic twin.…”
Section: True Genetic Differences In Monozygotic Twins (mentioning)
confidence: 99%
“…Only SNVs and InDels of up to 30 bp and found within 150 bp of the ends of the enriched targets were considered for subsequent analysis. Functional annotation of high-quality variants was performed using Annovar, 11 providing a comparison of the predicted variants to the National Center for Biotechnology Information (NCBI) SNP Database build 132 (dbSNP132), the March 2010 pilot release of the 1000 Genomes project (1000G; www.1000genomes.org), conservation around variants based on phastCons, 12 segmental duplication filter, gene annotation (exon/intron/UTR), amino-acid substitutions and splice variants based on UCSC Genome Browser 13 tracks, as well as multiple estimates of the impact of amino-acid substitution on the structure and function of proteins (tools: Sift, 14 Polyphen2, 15 PhyloP, 16 and MutationTaster 17 ). The reference sequences used for the four genes targeted in this study were NM_000277 (PAH), NM_000320 (QDPR), NM_000161 (GCH1), and NM_000317 (PTS).…”
Section: Bioinformatics Analysis Of DNA Variants (mentioning)
confidence: 99%
“…4. Variants were ranked based on evolutionary conservation and potential deleteriousness of the affected nucleotide using Sift, 14 Polyphen2, 15 PhyloP, 16 and MutationTaster. 17 All newly identified variants identified in this study have been submitted to PAHdb (http://www.pahdb.mcgill.ca/), which is the reference database for hyperphenylalaninemia mutations.…”
Section: Identification Of PKU and BH4DH Mutations (mentioning)
confidence: 99%
“…In the comparative genomics community, much attention has focused on two problems in particular: (1) identifying (especially noncoding) sequences that are unusually conserved across species, and thus are likely to be subject to negative selection (e.g., [3][4][5][6]); and (2) identifying protein-coding genes that show unusually high d N /d S ratios, and thus might be subject to positive selection (e.g., [7][8][9][10][11][12]). Methods focused on problem (1) generally have made the assumption (explicitly or implicitly) that selectional pressures are the same across all branches of a phylogeny-i.e., that each candidate sequence is under selection in all species or not under selection in any species.…”
Section: Introduction (mentioning)
confidence: 99%