Accurate identification of tumor-derived somatic variants in plasma circulating cell-free DNA (cfDNA) requires understanding the various biologic compartments contributing to the cfDNA pool. We sought to define the technical feasibility of a high-intensity sequencing assay of cfDNA and matched white blood cell (WBC) DNA covering a large genomic region (508 genes, 2 Mb, >60,000× raw depth) in a prospective study of 124 metastatic cancer patients, with contemporaneous matched tumor tissue biopsies, and 47 non-cancer controls. The assay displayed high sensitivity and specificity, allowing for de novo detection of tumor-derived mutations and inference of tumor mutational burden, microsatellite instability, mutational signatures and sources of somatic mutations identified in cfDNA. The vast majority of cfDNA mutations (81.6% in controls and 53.2% in cancer patients) had features consistent with clonal hematopoiesis (CH). This cfDNA sequencing approach revealed that CH constitutes a pervasive biological phenomenon, emphasizing the importance of matched cfDNA-WBC sequencing for accurate variant interpretation.
A useful definition of 'big data' is data that is too big to process comfortably on a single machine, either because of processor, memory, or disk bottlenecks. Graphics processing units can alleviate the processor bottleneck, but memory or disk bottlenecks can only be eliminated by splitting data across multiple machines. Communication between large numbers of machines is expensive (regardless of the amount of data being communicated), so there is a need for algorithms that perform distributed approximate Bayesian analyses with minimal communication. Consensus Monte Carlo operates by running a separate Monte Carlo algorithm on each machine, and then averaging individual Monte Carlo draws across machines. Depending on the model, the resulting draws can be nearly indistinguishable from the draws that would have been obtained by running a single-machine algorithm for a very long time. Examples of consensus Monte Carlo are shown for simple models where single-machine solutions are available, for large single-layer hierarchical models, and for Bayesian additive regression trees (BART).
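The averaging step described in this abstract can be illustrated with a minimal sketch. It assumes a toy model that the abstract does not specify: Gaussian data with known standard deviation and a flat prior on the mean, so each shard posterior is available in closed form. The precision weights, the function names, and all numeric settings are illustrative assumptions, not the authors' implementation.

```python
import random
import statistics

def shard_posterior_draws(shard, sigma, n_draws, rng):
    """Draw from one shard's posterior of the mean.

    Assumes known sigma and a flat prior, so the posterior is
    N(mean(shard), sigma^2 / len(shard)).
    """
    center = statistics.fmean(shard)
    scale = sigma / len(shard) ** 0.5
    return [rng.gauss(center, scale) for _ in range(n_draws)]

def consensus_draws(draws_by_shard, weights):
    """Combine the g-th draw from every shard into one weighted average."""
    total = sum(weights)
    return [
        sum(w * d for w, d in zip(weights, column)) / total
        for column in zip(*draws_by_shard)
    ]

rng = random.Random(0)
sigma = 2.0  # assumed known observation standard deviation
data = [rng.gauss(1.5, sigma) for _ in range(10_000)]

# Split the data across 10 hypothetical "machines".
shards = [data[i::10] for i in range(10)]
draws_by_shard = [shard_posterior_draws(s, sigma, 5000, rng) for s in shards]

# Assumed weighting: each shard's posterior precision, n_s / sigma^2.
weights = [len(s) / sigma**2 for s in shards]
consensus = consensus_draws(draws_by_shard, weights)

print("consensus mean:", statistics.fmean(consensus))
print("consensus sd:  ", statistics.stdev(consensus))
print("full-data posterior sd:", sigma / len(data) ** 0.5)
```

In this Gaussian toy case the weighted average reproduces the full-data posterior (mean equal to the overall sample mean, standard deviation sigma / sqrt(N)), which is the sense in which consensus draws can be "nearly indistinguishable" from single-machine draws. For non-Gaussian models the match is only approximate.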
Background: Noninvasive genotyping using plasma cell-free DNA (cfDNA) has the potential to obviate the need for some invasive biopsies in cancer patients while also elucidating disease heterogeneity. We sought to develop an ultra-deep plasma next-generation sequencing (NGS) assay for patients with non-small-cell lung cancers (NSCLC) that could detect targetable oncogenic drivers and resistance mutations in patients where tissue biopsy failed to identify an actionable alteration. Patients and methods: Plasma was prospectively collected from patients with advanced, progressive NSCLC. We carried out ultra-deep NGS using cfDNA extracted from plasma and matched white blood cells using a hybrid capture panel covering 37 lung cancer-related genes sequenced to 50 000× raw target coverage, filtering somatic mutations attributable to clonal hematopoiesis. Clinical sensitivity and specificity for plasma detection of known oncogenic drivers were calculated and compared with tissue genotyping results. Orthogonal ddPCR validation was carried out in a subset of cases. Results: In 127 assessable patients, plasma NGS detected driver mutations with variant allele fractions ranging from 0.14% to 52%. Plasma ddPCR for EGFR or KRAS mutations revealed findings nearly identical to those of plasma NGS in 21 of 22 patients, with high concordance of variant allele fraction (r = 0.98). Blinded to tissue genotype, plasma NGS sensitivity for de novo plasma detection of known oncogenic drivers was 75% (68/91). Specificity of plasma NGS in those who were driver-negative by tissue NGS was 100% (19/19). In 17 patients with tumor tissue deemed insufficient for genotyping, plasma NGS identified four KRAS mutations. 
In 23 EGFR mutant cases with acquired resistance to targeted therapy, plasma NGS detected potential resistance mechanisms, including EGFR T790M and C797S mutations and ERBB2 amplification. Conclusions: Ultra-deep plasma NGS with clonal hematopoiesis filtering resulted in de novo detection of targetable oncogenic drivers and resistance mechanisms in patients with NSCLC, including when tissue biopsy was inadequate for genotyping.
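As a small worked check of the reported percentages, the sketch below recomputes sensitivity and specificity from the counts given in the abstract (68 of 91 tissue-positive patients detected, 19 of 19 tissue-negative patients correctly negative). The helper name `proportion_percent` is hypothetical and used only for illustration.

```python
def proportion_percent(hits, total):
    """Return hits / total as a percentage."""
    return 100.0 * hits / total

# Counts as reported in the abstract.
sensitivity = proportion_percent(68, 91)  # drivers detected / tissue-positive
specificity = proportion_percent(19, 19)  # negatives / tissue-negative

print(f"sensitivity: {sensitivity:.1f}%")
print(f"specificity: {specificity:.1f}%")
```

The unrounded sensitivity is slightly below the 75% quoted in the text, which is the same figure rounded to the nearest whole percent.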
In a communication network, point-to-point traffic volumes over time are critical for designing protocols that route information efficiently and for maintaining security, whether at the scale of an Internet service provider or within a corporation. While technically feasible, the direct measurement of point-to-point traffic imposes a heavy burden on network performance and is typically not implemented. Instead, indirect aggregate traffic volumes are routinely collected. We consider the problem of estimating point-to-point traffic volumes, $x_t$, from aggregate traffic volumes, $y_t$, given information about the network routing protocol encoded in a matrix $A$. This estimation task can be reformulated as finding the solutions to a sequence of ill-posed linear inverse problems, $y_t = A x_t$, since the number of origin-destination routes of interest is higher than the number of aggregate measurements available. Here, we introduce a novel multilevel state-space model (SSM) of aggregate traffic volumes with realistic features. We implement a naïve strategy for estimating unobserved point-to-point traffic volumes from indirect measurements of aggregate traffic, based on particle filtering. We then develop a more efficient two-stage inference strategy that relies on model-based regularization: a simple model is used to calibrate regularization parameters that lead to efficient/scalable inference in the multilevel SSM. We apply our methods to corporate and academic networks, where we show that the proposed inference strategy outperforms existing approaches and scales to larger networks. We also design a simulation study to explore the factors that influence the performance. Our results suggest that model-based regularization may be an efficient strategy for inference in other complex multilevel models. Supplementary materials for this article are available online.
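To make the ill-posedness concrete, the sketch below sets up a toy routing matrix with two aggregate measurements and three routes, shows that two different traffic vectors produce the same aggregates, and then picks one solution with a generic ridge (Tikhonov) penalty solved by gradient descent. This is not the model-based regularization or the state-space model of the abstract. The matrix, the traffic values, the penalty `lam`, and the step size are all illustrative assumptions.

```python
def matvec(A, x):
    """Compute A x for a list-of-rows matrix A."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def rmatvec(A, r):
    """Compute A^T r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]

def ridge_solve(A, y, lam=1e-3, step=0.2, iters=500):
    """Minimise 0.5*||A x - y||^2 + 0.5*lam*||x||^2 by gradient descent from zero."""
    x = [0.0] * len(A[0])
    for _ in range(iters):
        resid = [p - q for p, q in zip(matvec(A, x), y)]
        grad = [g + lam * xi for g, xi in zip(rmatvec(A, resid), x)]
        x = [xi - step * gi for xi, gi in zip(x, grad)]
    return x

# Hypothetical routing matrix: 2 aggregate link counts, 3 origin-destination routes.
A = [[1, 1, 0],
     [0, 1, 1]]

x_a = [3, 1, 2]
x_b = [4, 0, 3]          # a different traffic pattern
y = matvec(A, x_a)       # observed aggregates

print("same aggregates:", matvec(A, x_b) == y)
x_hat = ridge_solve(A, y)
print("ridge estimate:", [round(v, 3) for v in x_hat])
print("fitted aggregates:", [round(v, 3) for v in matvec(A, x_hat)])
```

The estimate fits the aggregates closely yet matches neither `x_a` nor `x_b`, which illustrates why the choice of regularization matters in this setting.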
Technological advances in passive digital phenotyping present the opportunity to quantify neurological diseases using new approaches that may complement clinical assessments. Here, we studied multiple sclerosis (MS) as a model neurological disease for investigating physiometric and environmental signals. The objective of this study was to assess the feasibility and correlation of wearable biosensors with traditional clinical measures of disability in MS patients, both in clinic and in free-living. This is a single-site observational cohort study conducted at an academic neurological center specializing in MS. A cohort of 25 MS patients with varying disability scores was recruited. Patients were monitored in clinic while wearing biosensors at nine body locations at three separate visits. Biosensor-derived features including aspects of gait (stance time, turn angle, mean turn velocity) and balance were collected, along with standardized disability scores assessed by a neurologist. Participants also wore up to three sensors on the wrist, ankle, and sternum for 8 weeks as they went about their daily lives. The primary outcomes were feasibility, adherence, and correlation of biosensor-derived metrics with traditional neurologist-assessed clinical measures of disability. We used machine-learning algorithms to extract multiple features of motion and dexterity and correlated these measures with more traditional measures of neurological disability, including the expanded disability status scale (EDSS) and the MS functional composite-4 (MSFC-4). In free-living, sleep measures were additionally collected. Twenty-three subjects completed the first two of three in-clinic study visits and the 8-week free-living biosensor period. 
Several biosensor-derived features significantly correlated with EDSS and MSFC-4 scores derived at visit two, including mobility stance time with MSFC-4 z-score (Spearman correlation −0.546; p = 0.0070), several aspects of turning including turn angle (0.437; p = 0.0372), and maximum angular velocity (0.653; p = 0.0007). Similar correlations were observed at subsequent clinic visits, and in the free-living setting. We also found other passively collected signals, including measures of sleep, that correlated with disease severity. These findings demonstrate the feasibility of applying passive biosensor measurement techniques to monitor disability in MS patients both in clinic and in the free-living setting.