Accurate identification of tumor-derived somatic variants in plasma circulating cell-free DNA (cfDNA) requires understanding the various biologic compartments contributing to the cfDNA pool. We sought to define the technical feasibility of a high-intensity sequencing assay of cfDNA and matched white-blood cell (WBC) DNA covering a large genomic region (508 genes, 2Mb, >60,000X raw-depth) in a prospective study of 124 metastatic cancer patients, with contemporaneous matched tumor tissue biopsies, and 47 non-cancer controls. The assay displayed a high sensitivity and specificity, allowing for de novo detection of tumor-derived mutations and inference of tumor mutational burden, microsatellite instability, mutational signatures and sources of somatic mutations identified in cfDNA. The vast majority of cfDNA mutations (81.6% in controls and 53.2% in cancer patients) had features consistent with clonal hematopoiesis (CH). This cfDNA sequencing approach revealed that CH constitutes a pervasive biological phenomenon emphasizing the importance of matched cfDNA-WBC sequencing for accurate variant interpretation.
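The matched cfDNA-WBC comparison described above can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the study's pipeline: the `Variant` tuple, the `classify_cfdna_variants` helper, the `min_wbc_vaf` threshold, and the example variants are all hypothetical.

```python
from typing import Dict, NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def classify_cfdna_variants(
    cfdna_vaf: Dict[Variant, float],
    wbc_vaf: Dict[Variant, float],
    min_wbc_vaf: float = 0.001,  # hypothetical threshold, not from the source
) -> Dict[Variant, str]:
    """Label each cfDNA variant by whether it is also seen in matched WBC DNA."""
    labels = {}
    for variant in cfdna_vaf:
        # A variant that is also present in WBC DNA is more plausibly
        # hematopoietic in origin than tumor-derived.
        if wbc_vaf.get(variant, 0.0) >= min_wbc_vaf:
            labels[variant] = "WBC-matched"
        else:
            labels[variant] = "non-WBC-matched"
    return labels


# Made-up example data: variant allele fractions in cfDNA and in WBC DNA.
v1 = Variant("chr1", 100, "C", "T")
v2 = Variant("chr2", 200, "G", "A")
cfdna = {v1: 0.004, v2: 0.012}
wbc = {v1: 0.005}
labels = classify_cfdna_variants(cfdna, wbc)
```

Here `v1` is labeled `WBC-matched` and `v2` is labeled `non-WBC-matched`. A real assay would also need to account for sequencing error and depth at each site.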
Background: High-throughput sequencing is rapidly becoming common practice in clinical diagnosis and cancer research. Many algorithms have been developed for somatic single nucleotide variant (SNV) detection in matched tumor-normal DNA sequencing. Although numerous studies have compared the performance of various algorithms on exome data, there has not yet been a systematic evaluation using PCR-enriched amplicon data with a range of variant allele fractions. The recently developed gold standard variant set for the reference individual NA12878 by the NIST-led “Genome in a Bottle” Consortium (NIST-GIAB) provides a good resource to evaluate admixtures with various SNV fractions. Results: Using the NIST-GIAB gold standard, we compared the performance of five popular somatic SNV calling algorithms (GATK UnifiedGenotyper followed by simple subtraction, MuTect, Strelka, SomaticSniper and VarScan2) for matched tumor-normal amplicon and exome sequencing data. Conclusions: We demonstrated that the five commonly used somatic SNV calling methods are applicable to both targeted amplicon and exome sequencing data. However, the sensitivities of these methods vary based on the allelic fraction of the mutation in the tumor sample. Our analysis can assist researchers in choosing a somatic SNV calling method suitable for their specific needs.
Background: Noninvasive genotyping using plasma cell-free DNA (cfDNA) has the potential to obviate the need for some invasive biopsies in cancer patients while also elucidating disease heterogeneity. We sought to develop an ultra-deep plasma next-generation sequencing (NGS) assay for patients with non-small-cell lung cancers (NSCLC) that could detect targetable oncogenic drivers and resistance mutations in patients where tissue biopsy failed to identify an actionable alteration. Patients and methods: Plasma was prospectively collected from patients with advanced, progressive NSCLC. We carried out ultra-deep NGS using cfDNA extracted from plasma and matched white blood cells using a hybrid capture panel covering 37 lung cancer-related genes sequenced to 50 000× raw target coverage, filtering somatic mutations attributable to clonal hematopoiesis. Clinical sensitivity and specificity for plasma detection of known oncogenic drivers were calculated and compared with tissue genotyping results. Orthogonal ddPCR validation was carried out in a subset of cases. Results: In 127 assessable patients, plasma NGS detected driver mutations with variant allele fractions ranging from 0.14% to 52%. Plasma ddPCR for EGFR or KRAS mutations revealed findings nearly identical to those of plasma NGS in 21 of 22 patients, with high concordance of variant allele fraction (r = 0.98). Blinded to tissue genotype, plasma NGS sensitivity for de novo plasma detection of known oncogenic drivers was 75% (68/91). Specificity of plasma NGS in those who were driver-negative by tissue NGS was 100% (19/19). In 17 patients with tumor tissue deemed insufficient for genotyping, plasma NGS identified four KRAS mutations. In 23 EGFR mutant cases with acquired resistance to targeted therapy, plasma NGS detected potential resistance mechanisms, including EGFR T790M and C797S mutations and ERBB2 amplification. Conclusions: Ultra-deep plasma NGS with clonal hematopoiesis filtering resulted in de novo detection of targetable oncogenic drivers and resistance mechanisms in patients with NSCLC, including when tissue biopsy was inadequate for genotyping.
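The reported sensitivity and specificity follow directly from the counts given in the abstract. The short calculation below reproduces them; the function names are illustrative, and the split into true/false positives and negatives is inferred from the stated fractions (68/91 and 19/19), with tissue genotyping taken as the reference.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of reference-positive cases that the test detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of reference-negative cases that the test calls negative."""
    return true_neg / (true_neg + false_pos)


# 68 of 91 tissue-confirmed drivers were detected in plasma.
plasma_sensitivity = sensitivity(true_pos=68, false_neg=91 - 68)
# 19 of 19 tissue driver-negative patients were driver-negative in plasma.
plasma_specificity = specificity(true_neg=19, false_pos=0)
# ddPCR agreed with plasma NGS in 21 of 22 patients.
ddpcr_agreement = 21 / 22

print(f"sensitivity = {plasma_sensitivity:.1%}")  # about 74.7%, reported as 75%
print(f"specificity = {plasma_specificity:.1%}")  # 100.0%
print(f"ddPCR agreement = {ddpcr_agreement:.1%}")  # about 95.5%
```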
Background: PCR amplicon sequencing has been widely used as a targeted approach for both DNA and RNA sequence analysis. High multiplex PCR has further enabled the enrichment of hundreds of amplicons in one simple reaction. At the same time, the performance of PCR amplicon sequencing can be negatively affected by issues such as high duplicate reads, polymerase artifacts and PCR amplification bias. Recently researchers have made some good progress in addressing these shortcomings by incorporating molecular barcodes into PCR primer design. So far, most work has been demonstrated using one to a few pairs of primers, which limits the size of the region one can analyze. Results: We developed a simple protocol, which enables the use of molecular barcodes in high multiplex PCR with hundreds of amplicons. Using this protocol and reference materials, we demonstrated the applications in accurate variant calling at very low fraction over a large region and in targeted RNA quantification. We also evaluated the protocol’s utility in profiling FFPE samples. Conclusions: We demonstrated the successful implementation of molecular barcodes in high multiplex PCR, with multiplex scale many times higher than earlier work. We showed that the new protocol combines the benefits of both high multiplex PCR and molecular barcodes, i.e. the analysis of a very large region, low DNA input requirement, very good reproducibility and the ability to detect as low as 1% mutations with minimal false positives (FP). Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-1806-8) contains supplementary material, which is available to authorized users.
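The general idea behind molecular barcodes can be sketched as follows: reads that share a barcode are treated as copies of one original molecule and collapsed into a consensus, so that isolated polymerase or sequencing errors are outvoted. This is a generic illustration, not the protocol from the abstract; `consensus_by_barcode`, the `min_family_size` parameter, and the toy reads are hypothetical, and the reads are assumed to be pre-aligned and of equal length.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def consensus_by_barcode(
    reads: Iterable[Tuple[str, str]],
    min_family_size: int = 2,  # hypothetical cutoff for trusting a family
) -> Dict[str, str]:
    """Collapse (barcode, sequence) reads into one consensus per barcode."""
    families = defaultdict(list)
    for barcode, seq in reads:
        families[barcode].append(seq)

    consensus = {}
    for barcode, seqs in families.items():
        if len(seqs) < min_family_size:
            continue  # too few reads to separate errors from real variants
        # Majority vote at each position across the barcode family.
        consensus[barcode] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


# Toy reads: one family with a single-read error, one family carrying a
# consistent variant, and one singleton that is dropped.
reads = [
    ("AAGT", "ACGT"), ("AAGT", "ACGT"), ("AAGT", "ACTT"),
    ("CCTA", "ACTT"), ("CCTA", "ACTT"),
    ("GGGA", "ACGT"),
]
result = consensus_by_barcode(reads)
```

In this toy input the lone `ACTT` read in family `AAGT` is outvoted, while family `CCTA` keeps its variant because every read in the family supports it.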
Accurate estimation of expression levels from RNA-Seq data entails precise mapping of the sequence reads to a reference genome. Because the standard reference genome contains only one allele at any given locus, reads overlapping polymorphic loci that carry a non-reference allele are at least one mismatch away from the reference and, hence, are less likely to be mapped. This bias in read mapping leads to inaccurate estimates of allele-specific expression (ASE). To address this read-mapping bias, we propose the construction of an enhanced reference genome that includes the alternative alleles at known polymorphic loci. We show that mapping to this enhanced reference reduced the read-mapping biases, leading to more reliable estimates of ASE. Experiments on simulated data show that the proposed strategy reduced the number of loci with mapping bias by ≥63% when compared with a previous approach that relies on masking the polymorphic loci and by ≥18% when compared with the standard approach that uses an unaltered reference. When we applied our strategy to actual RNA-Seq data, we found that it mapped up to 15% more reads than the previous approaches and identified many seemingly incorrect inferences made by them.
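One way to picture an enhanced reference is as the standard reference plus short extra sequences that carry the alternative allele at each known polymorphic locus, so that reads with either allele can match without a mismatch. The abstract does not say how the enhanced reference is actually built, so the sketch below is only one possible construction; `alt_allele_contigs`, the `flank` parameter, and the contig naming scheme are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def alt_allele_contigs(
    reference: str,
    snps: List[Tuple[int, str, str]],  # (0-based position, ref base, alt base)
    flank: int,
) -> Dict[str, str]:
    """Build one short contig per SNP carrying the alternative allele."""
    contigs = {}
    for pos, ref, alt in snps:
        if reference[pos] != ref:
            raise ValueError(f"reference base at {pos} is not {ref}")
        start = max(0, pos - flank)
        end = min(len(reference), pos + flank + 1)
        # Same flanking sequence as the reference, with the alt base swapped in.
        contigs[f"snp_{pos}_{ref}>{alt}"] = (
            reference[start:pos] + alt + reference[pos + 1:end]
        )
    return contigs


# Toy reference with one known SNP (A to G) at position 4.
toy_reference = "ACGTACGTAC"
extra = alt_allele_contigs(toy_reference, [(4, "A", "G")], flank=3)
```

Reads would then be mapped against the original reference together with the extra contigs, and hits on an alternative-allele contig translated back to reference coordinates.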