Biological mechanisms underlying human germline mutations remain largely unknown. We statistically decompose variation in the rate and spectra of mutations along the genome using volume-regularized nonnegative matrix factorization. The analysis of a sequencing dataset (TOPMed) reveals nine processes that explain the variation in mutation properties between loci. We provide a biological interpretation for seven of these processes. We associate one process with bulky DNA lesions that resolve asymmetrically with respect to transcription and replication. Two processes track the direction of the replication fork and replication timing, respectively. We identify a mutagenic effect of active demethylation, acting primarily in regulatory regions, and a mutagenic effect of LINE repeats. We localize a mutagenic process specific to oocytes from population sequencing data. This process appears transcriptionally asymmetric.
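To make the decomposition step concrete, the following is a minimal sketch of plain nonnegative matrix factorization using the standard multiplicative updates for the Frobenius-norm objective. It is not the authors' method: the volume regularization term is omitted, and the function name `nmf`, its parameters, and the interpretation of rows as genomic loci and columns as mutation types are assumptions made for illustration only.

```python
import numpy as np


def nmf(V, rank, n_iter=500, seed=0, eps=1e-12):
    """Factor a nonnegative matrix V (loci x mutation types, by assumption)
    into W (loci x rank) and H (rank x mutation types), both nonnegative.

    Plain NMF with multiplicative updates; no volume regularization.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    for _ in range(n_iter):
        # Each update rescales the factor elementwise, so signs never change.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

In this sketch each row of `H` would play the role of one process (a spectrum over mutation types) and each column of `W` its intensity along the genome. A volume-regularized variant adds a penalty on the factors to the objective, which changes the update rules and is not shown here.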
Instability of repetitive DNA sequences causes numerous hereditary disorders in humans, the majority of which are associated with trinucleotide repeat expansions. Here we describe a unique system to study instability of triplet repeats in a yeast experimental setting. Using a fluctuation assay and the novel program FluCalc, we are able to accurately estimate the rates of large-scale expansions, as well as repeat-mediated mutagenesis and gross chromosomal rearrangements, for different repeat sequences.
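As an illustration of how a rate can be derived from fluctuation-assay data, here is a minimal sketch of the classical p0 (null-fraction) estimator from Luria-Delbrück analysis. It is a simpler stand-in, not a description of FluCalc's own algorithm, which the text above does not specify. The function name, its arguments, and the choice to divide by the final cell count per culture are assumptions for illustration.

```python
import math


def p0_rate(cultures_without_events, total_cultures, cells_per_culture):
    """Estimate events per cell per division with the p0 method.

    If events arise as a Poisson process, the fraction of cultures with
    zero events is p0 = exp(-m), where m is the expected number of events
    per culture. The rate is then m divided by the final cell count
    (one common convention; others apply an extra correction factor).
    """
    if not 0 < cultures_without_events < total_cultures:
        raise ValueError("p0 method needs some, but not all, cultures without events")
    p0 = cultures_without_events / total_cultures
    m = -math.log(p0)  # expected number of events per culture
    return m / cells_per_culture
```

For example, `p0_rate(5, 10, 1e7)` describes a hypothetical assay in which half of ten cultures, each grown to ten million cells, show no expansion events. The p0 method only works when the null fraction is neither 0 nor 1, which is why maximum-likelihood estimators are usually preferred for real data.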
SUMMARY
Expansions of simple DNA repeats cause numerous hereditary disorders in humans. Replication, repair and transcription are implicated in the expansion process, but their relative contributions are yet to be distinguished. To separate the roles of replication and transcription in the expansion of Friedreich’s ataxia (GAA)n repeats, we designed two yeast genetic systems that utilize a galactose-inducible GAL1 promoter but contain these repeats in either the transcribed or the non-transcribed region of a selectable cassette. We found that large-scale repeat expansions can occur in the absence of transcription. Induction of transcription strongly elevated the rate of expansions in both systems, indicating that an active transcriptional state, rather than transcription through the repeat per se, affects this process. Furthermore, replication defects increased the rate of repeat expansions irrespective of transcriptional state. We present a model in which transcriptional state, linked to the nucleosomal density of a region, acts as a modulator of large-scale repeat expansions.
Improper DNA double-strand break (DSB) repair results in complex genomic rearrangements (CGRs) in many cancers and various congenital disorders in humans. Trinucleotide repeat sequences, such as (GAA)n repeats in Friedreich's ataxia, (CTG)n repeats in myotonic dystrophy, and (CGG)n repeats in fragile X syndrome, are also subject to DSBs within the repetitive tract followed by DNA repair. Mapping the outcomes of CGRs is important for understanding their causes and potential phenotypic effects. However, high-resolution mapping of CGRs has traditionally been a laborious process requiring considerable skill. Recent advances in long-read DNA sequencing technologies, specifically Nanopore sequencing, have made possible the rapid identification of CGRs with single-base-pair resolution. Here, we have used whole-genome Nanopore sequencing to characterize several CGRs that originated from naturally occurring DSBs at (GAA)n microsatellites in Saccharomyces cerevisiae. These data gave us important insights into the mechanisms of DSB repair leading to CGRs.