Here we provide the first genome-wide, high-resolution map of the phylogenetic origin of the genome of most extant laboratory mouse inbred strains. Our analysis is based on the genotypes of wild-caught mice from three subspecies of Mus musculus. We demonstrate that classical laboratory strains are derived from a few fancy mice with limited haplotype diversity. Their genomes are overwhelmingly M. m. domesticus in origin and the remainder is mostly of Japanese origin. We generated genome-wide haplotype maps based on identity by descent from fancy mice and demonstrate that classical inbred strains have limited and non-randomly distributed genetic diversity. In contrast, wild-derived laboratory strains represent a broad sampling of diversity within M. musculus. Intersubspecific introgression is pervasive in these strains and contamination by laboratory stocks has played a role in this process. The subspecific origin, haplotype diversity and identity by descent maps can be visualized and searched online.
The genome of the laboratory mouse is thought to be a mosaic of regions with distinct subspecific origins. We have developed a high-resolution map of the origin of the laboratory mouse by generating 25,400 phylogenetic trees in 100 kb intervals spanning the genome. On average, 92% of the genome is of M. m. domesticus origin and the distribution of diversity is strikingly non-random among the chromosomes. There are large regions of extremely low diversity, representing blind spots for studies of natural variation and complex traits, as well as hot spots of diversity. In contrast with the mosaic model, we found that the majority of the genome has intermediate levels of variation of intrasubspecific origin. Finally, the wild-derived mouse strains that are supposed to represent different mouse subspecies show substantial intersubspecific introgression. This has serious implications for evolutionary studies that assume these are pure representatives of a given subspecies.
Laboratory mice, the most popular model organism in mammalian genetics [1,2], were derived from wild mice belonging to the Mus musculus species by an intricate process that included the generation of "fancy" mice in both Asia and Europe and a complex web of relationships among inbred strains [3]. Early studies demonstrated that the mitochondria and the Y chromosome present in many classical laboratory strains were derived from different subspecies, M. m. domesticus for the mitochondria and M. m. musculus for the Y chromosome [4,5]. Furthermore, the Y chromosome was introduced in the laboratory mouse through M. m. molossinus males [6]. Based on these findings, it was proposed that the genomes of inbred strains were a mosaic of regions with different subspecific origin [7]. Recently, the fine structure of such mosaic variation has been described [8].
This study reported that strain-to-strain comparisons revealed regions with extremely high variation spanning one third of the genome and regions with extremely low variation covering the remaining two thirds of the genome. This distinctively bimodal distribution was assumed to represent regions with the same and different subspecific origin. This mosaic model has been the driving concept behind mouse association mapping studies and haplotype analysis [9][10][11][12]. However, the origin of a given region of a laboratory strain could not be directly assigned to a subspecies due to the lack of reference sequences for the three major mouse subspecies. Subsequent studies raised questions regarding the haplotype structure [11,13], the effect of ascertainment biases in subspecific assignment [14][15][16] and the contributions of intersubspecific
Background & Aims: Cirrhosis and liver cancer are potential outcomes of advanced nonalcoholic fatty liver disease (NAFLD). It is not clear what factors determine whether patients will develop advanced or mild NAFLD, limiting non-invasive diagnosis and treatment before clinical sequelae emerge. We investigated whether DNA methylation profiles can distinguish patients with mild disease from those with advanced NAFLD, and how these patterns are functionally related to hepatic gene expression.
Methods: We collected frozen liver biopsies and clinical data from patients with biopsy-proven NAFLD (56 in the discovery cohort and 34 in the replication cohort). Samples were divided into groups based on histologic severity of fibrosis: F0–1 (mild) and F3–4 (advanced). DNA methylation profiles were determined and coupled with gene expression data from the same biopsies; differential methylation was validated in subsets of the discovery and replication cohorts. We then analyzed interactions between the methylome and transcriptome.
Results: Clinical features did not differ between patients known to have mild or advanced fibrosis based on biopsy analysis. There were 69,247 differentially methylated CpG sites (76% hypomethylated, 24% hypermethylated) in patients with advanced vs mild NAFLD (P<.05). Methylation at FGFR2, MAT1A, and CASP1 was validated by bisulfite pyrosequencing and the findings were reproduced in the replication cohort. Methylation correlated with gene transcript levels for 7% of differentially methylated CpG sites, indicating that differential methylation contributes to differences in expression. In samples with advanced NAFLD, many tissue repair genes were hypomethylated and overexpressed, whereas genes in certain metabolic pathways, including 1-carbon metabolism, were hypermethylated and under-expressed.
Conclusions: Functionally relevant differences in methylation can distinguish patients with advanced vs mild NAFLD. Altered methylation of genes that regulate processes such as steatohepatitis, fibrosis, and carcinogenesis indicates the role of DNA methylation in progression of NAFLD.
The Collaborative Cross Consortium reports here on the development of a unique genetic resource population. The Collaborative Cross (CC) is a multiparental recombinant inbred panel derived from eight laboratory mouse inbred strains. Breeding of the CC lines was initiated at multiple international sites using mice from The Jackson Laboratory. Currently, this innovative project is breeding independent CC lines at the University of North Carolina (UNC), at Tel Aviv University (TAU), and at Geniad in Western Australia (GND). These institutions aim to make publicly available the completed CC lines and their genotypes and sequence information. We genotyped, and report here, results from 458 extant lines from UNC, TAU, and GND using a custom genotyping array with 7500 SNPs designed to be maximally informative in the CC and used a novel algorithm to infer inherited haplotypes directly from hybridization intensity patterns. We identified lines with breeding errors and cousin lines generated by splitting incipient lines into two or more cousin lines at early generations of inbreeding. We then characterized the genome architecture of 350 genetically independent CC lines. Results showed that founder haplotypes are inherited at the expected frequency, although we also consistently observed highly significant transmission ratio distortion at specific loci across all three populations. On chromosome 2, there is significant overrepresentation of WSB/EiJ alleles, and on chromosome X, there is a large deficit of CC lines with CAST/EiJ alleles. Linkage disequilibrium decays as expected and we saw no evidence of gametic disequilibrium in the CC population as a whole or in random subsets of the population. Gametic equilibrium in the CC population is in marked contrast to the gametic disequilibrium present in a large panel of classical inbred strains. Finally, we discuss access to the CC population and to the associated raw data describing the genetic structure of individual lines. 
Integration of rich phenotypic and genomic data over time and across a wide variety of fields will be vital to delivering on one of the key attributes of the CC, a common genetic reference platform for identifying causative variants and genetic networks determining traits in mammals.
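The abstract above does not say how transmission ratio distortion was tested. As a minimal sketch, assume a chi-square goodness-of-fit test at a single locus against the expectation that each of the eight founders contributes equally (1/8 of lines). The counts below are hypothetical and chosen only for illustration. The labels `founder_1` to `founder_6` are placeholders; only WSB/EiJ and CAST/EiJ are named in the text. A real genome-wide scan would also need a correction for multiple testing, which this sketch omits.

```python
# Chi-square critical value for 7 degrees of freedom (8 founders - 1) at alpha = 0.05.
CHI2_CRIT_DF7_ALPHA05 = 14.067


def chi_square_uniform(counts):
    """Chi-square statistic for observed counts against equal expected counts."""
    total = sum(counts.values())
    expected = total / len(counts)  # each founder expected in 1/8 of lines
    return sum((obs - expected) ** 2 / expected for obs in counts.values())


# Hypothetical locus with near-equal founder representation (350 lines total).
balanced = {
    "founder_1": 44, "founder_2": 43, "founder_3": 45, "founder_4": 42,
    "founder_5": 44, "founder_6": 43, "CAST/EiJ": 45, "WSB/EiJ": 44,
}

# Hypothetical locus where one founder haplotype is overrepresented (350 lines total).
distorted = {
    "founder_1": 40, "founder_2": 40, "founder_3": 40, "founder_4": 40,
    "founder_5": 40, "founder_6": 40, "CAST/EiJ": 40, "WSB/EiJ": 70,
}

for name, counts in (("balanced", balanced), ("distorted", distorted)):
    stat = chi_square_uniform(counts)
    flag = "exceeds" if stat > CHI2_CRIT_DF7_ALPHA05 else "is below"
    print(f"{name}: chi-square = {stat:.2f}, {flag} the 5% critical value")
```

With these invented counts, the balanced locus stays well below the critical value while the distorted locus exceeds it, which is the kind of contrast the phrase "inherited at the expected frequency" versus "transmission ratio distortion" describes.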
We designed a high-density mouse genotyping array containing 623,124 SNPs that capture the known genetic variation present in the laboratory mouse. The array also contains 916,269 invariant genomic probes that are targeted to functional elements and regions known to harbor segmental duplications. The array opens the door to the characterization of genetic diversity, copy number variation, allele-specific gene expression and DNA methylation, and will extend the successes of human genome-wide association studies to the mouse.