Short tandem repeats (STRs) are a class of rapidly mutating genetic elements characterized by repeated units of 1 or more nucleotides. We leveraged whole genome sequencing data for 152 recombinant inbred (RI) strains from the BXD family derived from C57BL/6J and DBA/2J mice to study the effects of genetic background on genome-wide patterns of new mutations at STRs. We defined quantitative phenotypes describing the numbers and types of germline STR mutations in each strain and identified a locus on chromosome 13 associated with the propensity of STRs to expand. Several dozen genes lie in the QTL region, including Msh3, a known modifier of STR stability at pathogenic repeat expansions in mice and humans. Detailed analysis of the locus revealed a cluster of variants near the 5' end of Msh3, including multiple protein-coding variants within the DNA mismatch recognition domain of MSH3, and a retrotransposon insertion overlapping an annotated exon. Additionally, gene expression analysis demonstrates co-localization of this QTL with expression QTLs for multiple nearby genes, including Msh3. Our results suggest a novel role for Msh3 in regulating genome-wide patterns of germline STR mutations and demonstrate that inherited genetic variation can contribute to variability in accumulation of new mutations across individuals.
Linked-read whole genome sequencing methods, such as the 10x Chromium, attach a unique molecular barcode to each high molecular weight DNA molecule. The samples are then sequenced using short-read technology. During analysis, sequence reads sharing the same barcode are aligned to adjacent genomic locations. The pattern of barcode sharing between genomic regions allows the discovery of large structural variants (SVs) in the range of 1 Kb to a few Mb. Most SV calling methods for these data, such as LongRanger, analyze one sample at a time and often produce inconsistent results for the same genomic location across multiple samples. We developed a method, SVJAM, for joint calling of SVs, using data from 152 members of the BXD family of recombinant inbred strains of mice. Our method first collects candidate SV regions from single-sample analysis, such as those produced by LongRanger. We then retrieve barcode overlap data from all samples for each region. These data are organized as a high-dimensional matrix, whose dimension is then reduced using principal component analysis. Samples projected onto the two-dimensional space formed by the first two principal components form two or three clusters based on their genotype, representing the reference, alternative, or heterozygous alleles. We developed a novel distance measure for hierarchical clustering and rotate the axes to find the optimal clustering result. We also developed an algorithm to decide whether the pattern of sample distribution is best fitted with one, two, or three genotypes. For each sample, we calculate its membership score for each genotype. We compared results produced by SVJAM with those of LongRanger and a few methods that rely on PacBio or Oxford Nanopore data. In a comparison of SVJAM with SVs detected using long-read sequencing data for the DBA/2J strain, we found that our results recovered many SVs missed by LongRanger. We also found that many SVs called by LongRanger were assigned an incorrect SV type. Our algorithm also consistently identified heterozygous regions.
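The joint-calling steps described above (a barcode-overlap matrix, PCA, clustering into genotypes, per-sample membership scores) can be illustrated with a minimal Python sketch. This is not the SVJAM implementation. The toy matrix is invented, plain k-means stands in for the authors' custom distance measure, hierarchical clustering, and axis rotation, the number of genotypes is fixed at three rather than chosen by an algorithm, and the inverse-distance membership score is an assumption made for illustration.

```python
import math
import random

def center(rows):
    """Subtract each column mean so PCA captures variation between samples."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def leading_axis(rows, iterations=500, seed=0):
    """Leading principal axis of centered rows, by power iteration on X^T X."""
    rng = random.Random(seed)
    axis = [rng.uniform(-1.0, 1.0) for _ in rows[0]]
    norm = math.sqrt(sum(a * a for a in axis))
    axis = [a / norm for a in axis]
    for _ in range(iterations):
        scores = [sum(x * a for x, a in zip(row, axis)) for row in rows]
        new = [sum(s * row[j] for s, row in zip(scores, rows))
               for j in range(len(axis))]
        norm = math.sqrt(sum(x * x for x in new))
        if norm == 0.0:  # no variance left in this direction
            break
        axis = [x / norm for x in new]
    return axis

def project_2d(rows):
    """Project samples onto the first two principal components."""
    x = center(rows)
    pc1 = leading_axis(x)
    s1 = [sum(a * b for a, b in zip(row, pc1)) for row in x]
    # Deflation: remove the PC1 direction, then find the next axis.
    residual = [[a - s * b for a, b in zip(row, pc1)] for row, s in zip(x, s1)]
    pc2 = leading_axis(residual, seed=1)
    s2 = [sum(a * b for a, b in zip(row, pc2)) for row in residual]
    return list(zip(s1, s2))

def kmeans(points, k, iterations=50):
    """Plain k-means (a stand-in); centers start at evenly spaced PC1 ranks."""
    order = sorted(points)
    centers = [order[round(i * (len(order) - 1) / max(k - 1, 1))]
               for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centers

def membership(point, centers, eps=1e-9):
    """Soft score per cluster: inverse distance, normalized to sum to 1."""
    weights = [1.0 / (math.dist(point, c) + eps) for c in centers]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical barcode-overlap counts: one row per sample, one column per
# pair of genomic bins in a candidate SV region (values are made up).
overlap = [
    [2, 1, 2, 1],      # sample A
    [1, 2, 1, 2],      # sample B
    [11, 10, 10, 11],  # sample C
    [20, 21, 19, 20],  # sample D
    [21, 20, 20, 19],  # sample E
]

points = project_2d(overlap)
labels, centers = kmeans(points, k=3)
scores = [membership(p, centers) for p in points]

for name, label, score in zip("ABCDE", labels, scores):
    print(name, label, [round(s, 3) for s in score])
```

Cluster labels here are arbitrary integers. Mapping them to reference, alternative, or heterozygous genotypes would need extra information, such as the genotype of a sample with a known sequence in the region.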
Triple negative breast cancer (TNBC) is an aggressive breast cancer subtype with poor outcomes. This is a grave clinical challenge for the ~30,000 patients diagnosed with this disease every year. Discovering genetic modifiers of differential TNBC vulnerability and disease progression is critical to improving predictive and personalized treatments. We hypothesized that novel genetic modifiers of TNBC risk and aggression can be identified using a well-established recombinant inbred strain. The C3(1)-T antigen (C3Tag) genetically engineered mouse model (GEMM) recapitulates many facets of human basal-like TNBC to demonstrate promoting effects of exposures on tumor phenotypes. However, GEMMs are highly constrained by their inbred genotype and do not allow a robust interrogation of how individual genetic variation might affect tumor initiation, progression, and response to therapy. Therefore, we developed a novel murine model of TNBC in the background of the largest and best characterized genetic reference population. Systems genetics is used to identify gene candidates. Cross-species comparison of our findings with publicly available human GWAS and genomic databases is an effective approach to validate conserved, biologically relevant, and targetable pathways. To our knowledge, this is the first study to explore modifier genes for TNBC phenotypes using a systems genetics approach in a GEMM for TNBC. Our results will contribute to significant advances in understanding risk and improving outcomes for breast cancer.
Citation Format: Laura M. Sipe, Emily B. Korba, Lu Lu, Robert W. Williams, David G. Ashbrook, Liza Makowski. Novel pre-clinical model to identify genetic modifiers of triple negative breast cancer [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 2919.
Recombinant inbred rodents form immortal genome-types that can be resampled deeply at many stages, in both sexes, and under multiple experimental conditions to model genome-environment interactions and to test genome-phenome predictions. This allows for experimental precision medicine, for which sophisticated causal models of complex interactions among DNA variants, phenotype variants at many levels, and innumerable environmental factors are required. Large families and populations of isogenic lines of mice and rats are now available and have been used across fields of biology. We will use the BXD recombinant inbred family and their derived diallel cross population as an example for predictive, experimental precision medicine and biology.