As whole-genome sequencing for cancer genome analysis becomes a clinical tool, a full understanding of the variables affecting sequencing analysis output is required. Here using tumour-normal sample pairs from two different types of cancer, chronic lymphocytic leukaemia and medulloblastoma, we conduct a benchmarking exercise within the context of the International Cancer Genome Consortium. We compare sequencing methods, analysis pipelines and validation methods. We show that using PCR-free methods and increasing sequencing depth to ∼100 × shows benefits, as long as the tumour:control coverage ratio remains balanced. We observe widely varying mutation call rates and low concordance among analysis pipelines, reflecting the artefact-prone nature of the raw data and lack of standards for dealing with the artefacts. However, we show that, using the benchmark mutation set we have created, many issues are in fact easy to remedy and have an immediate positive impact on mutation detection accuracy.
Cancers require telomere maintenance mechanisms for unlimited replicative potential. They achieve this through TERT activation or alternative telomere lengthening associated with ATRX or DAXX loss. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we dissect whole-genome sequencing data of over 2500 matched tumor-control samples from 36 different tumor types aggregated within the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium to characterize the genomic footprints of these mechanisms. While the telomere content of tumors with ATRX or DAXX mutations (ATRX/DAXX trunc) is increased, tumors with TERT modifications show a moderate decrease of telomere content. One quarter of all tumor samples contain somatic integrations of telomeric sequences into non-telomeric DNA. This fraction is increased to 80% prevalence in ATRX/DAXX trunc tumors, which carry an aberrant telomere variant repeat (TVR) distribution as another genomic marker. The latter feature includes enrichment or depletion of the previously undescribed singleton TVRs TTCGGG and TTTGGG, respectively. Our systematic analysis provides new insight into the recurrent genomic alterations associated with telomere maintenance mechanisms in cancer.
Background Establishment of telomere maintenance mechanisms is a universal step in tumor development to achieve replicative immortality. These processes leave molecular footprints in cancer genomes in the form of altered telomere content and aberrations in telomere composition. To retrieve these telomere characteristics from high-throughput sequencing data the available computational approaches need to be extended and optimized to fully exploit the information provided by large scale cancer genome data sets. Results We here present TelomereHunter, a software for the detailed characterization of telomere maintenance mechanism footprints in the genome. The tool is implemented for the analysis of large cancer genome cohorts and provides a variety of diagnostic diagrams as well as machine-readable output for subsequent analysis. A novel key feature is the extraction of singleton telomere variant repeats, which improves the identification and subclassification of the alternative lengthening of telomeres phenotype. We find that whole genome sequencing-derived telomere content estimates strongly correlate with telomere qPCR measurements (r = 0.94). For the first time, we determine the correlation of in silico telomere content quantification from whole genome sequencing and whole genome bisulfite sequencing data derived from the same tumor sample (r = 0.78). An analogous comparison of whole exome sequencing data and whole genome sequencing data measured slightly lower correlation (r = 0.79). However, this is considerably improved by normalization with matched controls (r = 0.91). Conclusions TelomereHunter provides new functionality for the analysis of the footprints of telomere maintenance mechanisms in cancer genomes. Besides whole genome sequencing, whole exome sequencing and whole genome bisulfite sequencing are suited for in silico telomere content quantification, especially if matched control samples are available. The software runs under a GPL license and is available at https://www.dkfz.de/en/applied-bioinformatics/telomerehunter/telomerehunter.html . Electronic supplementary material The online version of this article (10.1186/s12859-019-2851-0) contains supplementary material, which is available to authorized users.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.