Mammalian DNA replication is a highly organized and regulated process. Large, Mb-sized regions are replicated at defined times during S-phase. Replication Timing (RT) is thought to play a role in shaping the mammalian genome by affecting mutation rates. Previous analyses relied on somatic RT profiles. However, only germline mutations are passed on to offspring and affect genomic composition. Therefore, germ cell RT information is necessary to evaluate the influence of RT on the mammalian genome. We adapted the RT mapping technique for limited numbers of cells, and measured RT at two stages in the mouse germline: primordial germ cells (PGCs) and spermatogonial stem cells (SSCs). RT in germline cells exhibited stronger correlations with both mutation rate and recombination hotspot density than RT in somatic tissues, emphasizing the importance of using the correct tissue of origin for RT profiling. Germline RT maps also exhibited stronger correlations with additional genetic features, including GC content, transposable elements (SINEs and LINEs), and gene density. GC content stratification and multiple regression analysis revealed independent contributions of RT to SINE, gene, mutation, and recombination hotspot densities. Together, our results establish a central role for RT in shaping multiple levels of mammalian genome composition.
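The GC content stratification mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the bin edges, and the per-window input lists are all hypothetical, and plain Pearson correlation is an assumed choice of statistic. The idea is that if RT still correlates with a feature inside narrow GC-content bins, the association is unlikely to be explained by GC content alone.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists; None if undefined."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # one variable is constant within the bin
    return sxy / math.sqrt(sxx * syy)


def stratified_correlations(gc, rt, feature, edges):
    """Correlate RT with a genomic feature separately within each GC bin.

    gc, rt, feature: per-window values (hypothetical inputs, same length).
    edges: ascending GC bin edges; bins are half-open [lo, hi).
    Bins with fewer than 3 windows are skipped (an assumed cutoff).
    """
    result = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        idx = [i for i, g in enumerate(gc) if lo <= g < hi]
        if len(idx) >= 3:
            xs = [rt[i] for i in idx]
            ys = [feature[i] for i in idx]
            result[(lo, hi)] = pearson(xs, ys)
    return result


# Toy data only; these numbers are invented for illustration.
gc = [0.35, 0.36, 0.37, 0.45, 0.46, 0.47]
rt = [1, 2, 3, 1, 2, 3]
feature = [2, 4, 6, 3, 2, 1]
by_bin = stratified_correlations(gc, rt, feature, [0.3, 0.4, 0.5])
```

A multiple regression with RT and GC content as joint predictors would address the same question in a single model; the stratified version is shown here only because it needs nothing beyond the standard library.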
Cancer somatic mutations are the product of multiple mutational and repair processes, both of which are tightly associated with DNA replication. Distinctive patterns of somatic mutation accumulation, termed mutational signatures, are indicative of processes sustained within tumors. However, the association of various mutational processes with replication timing (RT) remains an open question. In this study, we systematically analyzed the mutational landscape of 2,787 tumors from 32 tumor types separately for early and late replicating regions using sequence context normalization and chromatin data to account for sequence and chromatin accessibility differences. To account for sequence differences between various genomic regions, an artificial genome–based approach was developed to expand the signature analyses to doublet base substitutions and small insertions and deletions. The association of mutational processes and RT was signature specific: Some signatures were associated with early or late replication (such as SBS7b and SBS7a, respectively), and others had no association. Most associations existed even after normalizing for genome accessibility. A focused mutational signature identification approach was also developed that uses RT information to improve signature identification; this approach found that SBS16, which is biased toward early replication, is strongly associated with better survival rates in liver cancer. Overall, this novel and comprehensive approach provides a better understanding of the etiology of mutational signatures, which may lead to improved cancer prevention, diagnosis, and treatment. Significance: Many mutational processes associate with early or late replication timing regions independently of chromatin accessibility, enabling development of a focused identification approach to improve mutational signature detection.
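Sequence context normalization, as described above, can be sketched roughly as follows. This is an illustrative simplification under stated assumptions, not the published method: the function names are hypothetical, mutations are represented only by the reference trinucleotide they fall in, and strand folding (collapsing contexts onto the pyrimidine strand) is omitted. The point is that raw mutation counts in early and late regions are only comparable after dividing by how often each context occurs in each region.

```python
from collections import Counter

BASES = set("ACGT")


def trinucleotide_counts(seq):
    """Count every overlapping trinucleotide made only of A/C/G/T."""
    seq = seq.upper()
    return Counter(
        seq[i:i + 3]
        for i in range(len(seq) - 2)
        if set(seq[i:i + 3]) <= BASES
    )


def normalized_rates(mutation_contexts, region_seq):
    """Mutations per available site, for each trinucleotide context.

    mutation_contexts: reference trinucleotide of each observed mutation
                       (hypothetical input format).
    region_seq: concatenated sequence of the region class (early or late).
    """
    opportunities = trinucleotide_counts(region_seq)
    observed = Counter(mutation_contexts)
    return {ctx: observed[ctx] / n for ctx, n in opportunities.items()}


# Toy example: the early region has three ACG sites, the late region one.
early_rates = normalized_rates(["ACG", "ACG", "ACG"], "ACGACGACG")
late_rates = normalized_rates(["ACG"], "ACGTTTTTT")
```

In this toy case the early region carries three times as many ACG mutations as the late region, yet the per-site rate is identical, so the raw difference reflects sequence composition rather than a replication timing bias.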
Avoiding biases in next generation sequencing (NGS) library preparation is crucial for obtaining reliable sequencing data. Recently, a new library preparation method has been introduced which has eliminated the need for the ligation step. This method, termed SMART (switching mechanism at the 5′ end of the RNA transcript), is based on template switching reverse transcription. To date, there has been no systematic analysis of the additional biases introduced by this method. We analysed the genomic distribution of sequenced reads prepared from genomic DNA using the SMART methodology and found a strong bias toward long (≥12bp) poly dA/dT containing genomic loci. This bias is unique to the SMART-based library preparation and does not appear when libraries are prepared with conventional ligation based methods. Although this bias is obvious only when performing paired end sequencing, it affects single end sequenced samples as well. Our analysis demonstrates that sequenced reads originating from SMART-DNA libraries are heavily skewed toward genomic poly dA/dT tracts. This bias needs to be considered when deciding to use SMART based technology for library preparation.
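One way to check a library for this kind of bias is to locate long poly dA/dT tracts in the reference and ask what fraction of read start positions fall near them. The sketch below assumes that a "long" tract is an uninterrupted run of at least 12 A or T bases, following the threshold quoted above; the function names, the window size, and the use of read start coordinates are hypothetical choices, not the authors' procedure.

```python
import re


def find_poly_at_tracts(seq, min_len=12):
    """Return (start, end, base) for each run of >= min_len A's or T's.

    Coordinates are 0-based and half-open. Only uninterrupted
    homopolymer runs are reported (an assumed definition of "tract").
    """
    pattern = re.compile(r"A{%d,}|T{%d,}" % (min_len, min_len))
    return [
        (m.start(), m.end(), m.group()[0])
        for m in pattern.finditer(seq.upper())
    ]


def fraction_reads_near_tracts(read_starts, tracts, window=50):
    """Fraction of read starts within `window` bp of any tract."""
    if not read_starts:
        return 0.0
    near = sum(
        1
        for pos in read_starts
        if any(start - window <= pos < end + window for start, end, _ in tracts)
    )
    return near / len(read_starts)


# Toy reference: a 12 bp poly-A, an 11 bp poly-T (too short), a 15 bp poly-T.
reference = "ACGT" + "A" * 12 + "CG" + "T" * 11 + "G" + "T" * 15
tracts = find_poly_at_tracts(reference)
share = fraction_reads_near_tracts([0, 10, 100, 40], tracts, window=5)
```

Comparing this fraction between a SMART-based library and a ligation-based library prepared from the same DNA would give a rough indication of whether reads are enriched around such tracts.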
Stochastic asynchronous replication timing (AS-RT) is a phenomenon in which the two alleles replicate at different times and the identity of the early allele varies between cells. By taking advantage of stable clonal pre-B cell populations derived from C57BL6/Castaneus mice, we have mapped AS-RT loci genome-wide, independently of genetic differences. These regions are characterized by differential chromatin accessibility and mono-allelic expression, and include new gene families involved in specifying cell identity. By combining population-level mapping with single-cell FISH, our data reveal the existence of a novel regulatory program that coordinates a fixed relationship between AS-RT regions on any given chromosome, with some loci set to replicate in a parallel orientation and others in an anti-parallel orientation. Our results show that AS-RT is a highly regulated epigenetic mark established during early embryogenesis that may be used for facilitating the programming of mono-allelic choice throughout development.
Background: Cancer somatic mutations are the product of multiple mutational and repair processes, which are tightly associated with DNA replication. Distinctive patterns of somatic mutation accumulation in tumors, termed mutational signatures, are indicative of processes the tumors underwent. While tumor mutational load is correlated with late replicating regions and spatial genome organization, much is unknown about the association of many different mutational processes with replication timing, and the interplay with chromatin structure remains an open question.

Methods: We systematically analyzed the mutational landscape of 2,787 WGS tumors from 32 different tumor types separately for early and late replicating regions. We used sequence context normalization and chromatin data to account for sequence and chromatin accessibility differences between early and late replicating regions. Moreover, we expanded the signature analyses to doublet base substitutions and small insertions and deletions by developing an artificial genome-based approach to account for sequence differences between various genomic regions.

Results: We revealed the replication timing (RT) association of single base, doublet base, and small insertion and deletion mutational signatures. The association is signature specific: some signatures are associated with early or late replication (such as UV-exposure signatures SBS7b and SBS7a, respectively) and others have no association. Most associations exist even after normalizing for genome accessibility. We further developed a focused mutational signature identification approach, which uses RT information to improve signature identification, and found that SBS16, which is biased towards early replication, is strongly associated with better survival rates in liver cancer.

Conclusions: Our comprehensive analyses enabled a more robust classification of the RT association of single base, doublet base, and indel signatures. By doing so, we demonstrated variation in the association with RT: many mutational processes are biased towards either early or late replication timing, and others have an equal RT distribution. These associations were independent of chromatin accessibility in most cases. This work highlights that restricting signature analyses to concise genomic regions improves identification of signatures, such as SBS16, and demonstrates its clinical relevance as a predictor of improved survival of liver cancer patients.