DNA cytosine methylation is a central epigenetic marker that is usually mutagenic and may increase the level of sequence divergence. However, methylated genes have been reported to evolve more slowly than unmethylated genes. Hence, there is a controversy on whether DNA methylation is correlated with increased or decreased protein evolutionary rates. We hypothesize that this controversy has resulted from the differential correlations between DNA methylation and the evolutionary rates of coding exons in different genic positions. To test this hypothesis, we compare human-mouse and human-macaque exonic evolutionary rates against experimentally determined single-base resolution DNA methylation data derived from multiple human cell types. We show that DNA methylation is significantly related to within-gene variations in evolutionary rates. First, DNA methylation level is more strongly correlated with C-to-T mutations at CpG dinucleotides in the first coding exons than in the internal and last exons, although it is positively correlated with the synonymous substitution rate in all exon positions. Second, for the first exons, DNA methylation level is negatively correlated with exonic expression level, but positively correlated with both nonsynonymous substitution rate and the sample specificity of DNA methylation level. For the internal and last exons, however, we observe the opposite correlations. Our results imply that DNA methylation level is differentially correlated with the biological (and evolutionary) features of coding exons in different genic positions. 
The first exons appear more prone to the mutagenic effects, whereas the other exons are more influenced by the regulatory effects of DNA methylation.

methylation-associated mutation | exon evolution | genomics | deep sequencing | bioinformatics

DNA methylation is a common form of epigenetic modification that is important for a variety of biological functions, including transcriptional silencing (1), genomic imprinting (2), X-chromosome inactivation (3), the silencing of transposons (4), tumorigenesis (5), and the differentiation of pluripotent cells (6). DNA methylation is unevenly distributed in the human genome. For example, the CpG islands near the promoter regions of active genes tend to be unmethylated (7-10), because methylation in these regions is strongly correlated with transcriptional suppression (8). It has also been reported that exons tend to be more methylated than introns, and that sharp transitions of methylation occur at exon-intron boundaries in human (6). Moreover, the level of DNA methylation also varies among exonic regions. The levels of DNA methylation in the first exons were reported to be lower than in downstream exons, and tightly linked to transcriptional silencing (11). In addition, the level of DNA methylation exhibits a slight but sharp step-down at the transcriptional terminal sites (6).

In terms of molecular evolution, DNA methylation is relevant in two aspects. First, DNA methylation can significantly increase the rate of spontan...
Summary: CNVVdb is a web interface for identifying putative copy number variations (CNVs) among 16 vertebrate species using same-species self-alignments and cross-species pairwise alignments. By querying genomic coordinates in the target species, all the potential paralogous/orthologous regions that overlap ≥80–100% (adjustable) of the query sequences with a user-specified sequence identity (from ≥60% to ≥90%) are returned. Additional information is also given for the genes included in the returned regions, including gene description, alternatively spliced transcripts, gene ontology descriptions and other biologically important information. CNVVdb also provides information on pseudogenes and single nucleotide polymorphisms (SNPs) for the CNV-related genomic regions. Moreover, multiple sequence alignments of shared CNVs across species are provided. By combining CNV, SNP, pseudogene and functional information, CNVVdb can be very useful for comparative and functional studies in vertebrates.

Availability: CNVVdb is freely accessible at http://CNVVdb.genomics.sinica.edu.tw.

Contact: trees@gate.sinica.edu.tw
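The query rule described in the Summary (return regions that cover at least an adjustable fraction of the query at or above a chosen sequence identity) can be illustrated with a minimal sketch. This is not CNVVdb's actual implementation: the `Hit` record, its field names, the coordinate convention (1-based, inclusive, on the query sequence) and the default thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical alignment record; not CNVVdb's real schema."""
    name: str
    q_start: int      # aligned span on the query, 1-based inclusive (assumed)
    q_end: int
    identity: float   # percent sequence identity of the alignment


def query_coverage(q_start: int, q_end: int, hit: Hit) -> float:
    """Fraction of the query interval [q_start, q_end] covered by the hit."""
    overlap = min(q_end, hit.q_end) - max(q_start, hit.q_start) + 1
    return max(0, overlap) / (q_end - q_start + 1)


def filter_hits(q_start, q_end, hits, min_coverage=0.8, min_identity=60.0):
    """Keep hits that pass both the coverage and the identity threshold."""
    return [
        h for h in hits
        if query_coverage(q_start, q_end, h) >= min_coverage
        and h.identity >= min_identity
    ]


# Illustrative values only: a 1,000 bp query at positions 1001-2000.
example_hits = [
    Hit("A", 1001, 1900, 92.0),  # covers 90% of the query, high identity
    Hit("B", 1501, 2500, 95.0),  # covers only 50% of the query
    Hit("C", 1001, 2000, 55.0),  # full coverage, identity below threshold
]
kept = filter_hits(1001, 2000, example_hits)
print([h.name for h in kept])
```

Raising `min_coverage` toward 1.0 or `min_identity` toward 90 mirrors the adjustable ranges mentioned in the Summary.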
Background
Complex human diseases may be associated with many gene interactions. Gene interactions take several different forms and it is difficult to identify all of the interactions that are potentially associated with human diseases. One approach that may fill this knowledge gap is to infer previously unknown gene interactions via identification of non-physical linkages between different mutations (or single nucleotide polymorphisms, SNPs) to avoid hitchhiking effect or lack of recombination. Strong non-physical SNP linkages are considered to be an indication of biological (gene) interactions. These interactions can be physical protein interactions, regulatory interactions, functional compensation/antagonization or many other forms of interactions. Previous studies have shown that mutations in different genes can be linked to the same disorders. Therefore, non-physical SNP linkages, coupled with knowledge of SNP-disease associations may shed more light on the role of gene interactions in human disorders. A user-friendly web resource that integrates information about non-physical SNP linkages, gene annotations, SNP information, and SNP-disease associations may thus be a good reference for biomedical research.

Findings
Here we extracted the SNPs located within the promoter or exonic regions of protein-coding genes from the HapMap database to construct a database named the Linkage-Disequilibrium-based Gene Interaction database (LDGIdb). The database stores 646,203 potential human gene interactions, which are potential interactions inferred from SNP pairs that are subject to long-range strong linkage disequilibrium (LD), or non-physical linkages. To minimize the possibility of hitchhiking, SNP pairs inferred to be non-physically linked were required to be located in different chromosomes or in different LD blocks of the same chromosomes.
According to the genomic locations of the involved SNPs (i.e., promoter, untranslated region (UTR) and coding region (CDS)), the SNP linkages inferred were categorized into promoter-promoter, promoter-UTR, promoter-CDS, CDS-CDS, CDS-UTR and UTR-UTR linkages. For the CDS-related linkages, the coding SNPs were further classified into nonsynonymous and synonymous variations, which represent potential gene interactions at the protein and RNA level, respectively. The LDGIdb also incorporates human disease-association databases such as Genome-Wide Association Studies (GWAS) and Online Mendelian Inheritance in Man (OMIM), so that the user can search for potential disease-associated SNP linkages. The inferred SNP linkages are also classified in the context of population stratification to provide a resource for investigating potential population-specific gene interactions.

Conclusion
The LDGIdb is a user-friendly resource that integrates non-physical SNP linkages and SNP-disease associations for studies of gene interactions in human diseases. With the help of the LDGIdb, it is plausible to infer population-specific SNP linkages for more focused studies, an avenue that is potentially important f...
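The two rules stated in the Findings (a SNP pair counts as non-physically linked only if the SNPs lie on different chromosomes or in different LD blocks of the same chromosome, and each linkage is labeled by the genomic regions of its two SNPs) can be sketched as follows. This is an illustrative sketch, not the LDGIdb pipeline: the `Snp` record, the integer `ld_block` identifier and the SNP names are hypothetical, and the strong-LD test itself is assumed to have been done upstream.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    """Hypothetical SNP record; field names are assumptions."""
    name: str
    chrom: str
    ld_block: int   # assumed identifier of the LD block within a chromosome
    region: str     # "promoter", "UTR" or "CDS"


# Ordering chosen so labels match the six categories named in the text.
_REGION_ORDER = {"promoter": 0, "CDS": 1, "UTR": 2}


def is_non_physical(a: Snp, b: Snp) -> bool:
    """True if the pair is on different chromosomes or in different LD blocks."""
    return a.chrom != b.chrom or a.ld_block != b.ld_block


def linkage_category(a: Snp, b: Snp) -> str:
    """Order-independent label such as 'promoter-UTR' or 'CDS-CDS'."""
    first, second = sorted((a.region, b.region), key=_REGION_ORDER.__getitem__)
    return f"{first}-{second}"


# Illustrative SNPs only; these are not real identifiers.
snp_a = Snp("snp_a", "1", 3, "UTR")
snp_b = Snp("snp_b", "1", 7, "promoter")   # same chromosome, different LD block
snp_c = Snp("snp_c", "1", 3, "CDS")        # same chromosome, same LD block as snp_a
snp_d = Snp("snp_d", "2", 3, "CDS")        # different chromosome

for x, y in [(snp_a, snp_b), (snp_a, snp_c), (snp_a, snp_d)]:
    print(x.name, y.name, is_non_physical(x, y), linkage_category(x, y))
```

Block identifiers are compared only when the chromosomes match, so reusing the same block number on two chromosomes (as with `snp_a` and `snp_d`) does not cause a false rejection.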