We report genome sequences of 17 inbred strains of laboratory mice and identify almost ten times more variants than previously known. We use these genomes to explore the phylogenetic history of the laboratory mouse and to examine the functional consequences of allele-specific variation on transcript abundance, revealing that at least 12% of transcripts show a significant tissue-specific expression bias. By identifying candidate functional variants at 718 quantitative trait loci we show that the molecular nature of functional variants and their position relative to genes vary according to the effect size of the locus. These sequences provide a starting point for a new era in the functional analysis of a key model organism.
Background: Adenosine-to-inosine (A-to-I) editing is a site-selective post-transcriptional alteration of double-stranded RNA by ADAR deaminases that is crucial for homeostasis and development. Recently the Mouse Genomes Project generated genome sequences for 17 laboratory mouse strains and rich catalogues of variants. We also generated RNA-seq data from whole brain RNA from 15 of the sequenced strains. Results: Here we present a computational approach that takes an initial set of transcriptome/genome mismatch sites and filters these calls taking into account systematic biases in alignment, single nucleotide variant calling, and sequencing depth to identify RNA editing sites with high accuracy. We applied this approach to our panel of mouse strain transcriptomes, identifying 7,389 editing sites with an estimated false-discovery rate of between 2.9 and 10.5%. The overwhelming majority of these edits were of the A-to-I type, with less than 2.4% not of this class, and only three of these edits could not be explained as alignment artifacts. We validated 24 novel RNA editing sites in coding sequence, including two non-synonymous edits in the Cacna1d gene that fell into the IQ domain portion of the Cav1.2 voltage-gated calcium channel, indicating a potential role for editing in the generation of transcript diversity. Conclusions: We show that despite over two million years of evolutionary divergence, the sites edited and the level of editing at each site are remarkably consistent across the 15 strains. In the Cds2 gene we find evidence for RNA editing acting to preserve the ancestral transcript sequence despite genomic sequence divergence.
In 100 primary colorectal carcinomas, we demonstrate by array comparative genomic hybridization (aCGH) that 33% show DNA copy number (DCN) loss involving PARK2, the gene encoding PARKIN, the E3 ubiquitin ligase whose deficiency is responsible for a form of autosomal recessive juvenile parkinsonism. PARK2 is located on chromosome 6 (at 6q25-27), a chromosome with one of the lowest overall frequencies of DNA copy number alterations recorded in colorectal cancers. The PARK2 deletions are mostly focal (31% ∼0.5 Mb on average), heterozygous, and show maximum incidence in exons 3 and 4. As PARK2 lies within FRA6E, a large common fragile site, it has been argued that the observed DCN losses in PARK2 in cancer may represent merely the result of enforced replication of locally vulnerable DNA. However, we show that deficiency in expression of PARK2 is significantly associated with adenomatous polyposis coli (APC) deficiency in human colorectal cancer. Evidence of some PARK2 mutations and promoter hypermethylation is described. PARK2 overexpression inhibits cell proliferation in vitro. Moreover, interbreeding of Park2 heterozygous knockout mice with Apc Min mice resulted in a dramatic acceleration of intestinal adenoma development and increased polyp multiplicity. We conclude that PARK2 is a tumor suppressor gene whose haploinsufficiency cooperates with mutant APC in colorectal carcinogenesis. Keywords: array | comparative genomic hybridization | mouse model | PARKIN