Supplementary data are available at Bioinformatics online.
Motivation: Tumor sequencing has entered an exciting phase with the advent of single-cell techniques that are revolutionizing the assessment of single nucleotide variation (SNV) at the highest cellular resolution. However, state-of-the-art single-cell sequencing technologies produce data with many missing bases (MBs) and incorrect base designations that lead to false-positive (FP) and false-negative (FN) detection of somatic mutations. While computational methods are available to make biological inferences in the presence of these errors, the accuracy of the imputed MBs and corrected FPs and FNs remains unknown. Results: Using computer-simulated datasets, we assessed the performance of four existing methods (OncoNEM, SCG, SCITE and SiFit) and one new method (BEAM). BEAM is a Bayesian evolution-aware method that improves the quality of single-cell sequences by using the intrinsic evolutionary information in the single-cell data in a molecular phylogenetic framework. Overall, BEAM and SCITE performed the best. Most of the methods imputed MBs with high accuracy, but effective detection and correction of FPs and FNs is a challenge, especially for small datasets. Analysis of an empirical dataset shows that computational methods can improve both the quality of tumor single-cell sequences and their utility for biological inference. In conclusion, tumor cells descend from pre-existing cells, which creates evolutionary continuity in single-cell sequencing datasets. This information enables BEAM and other methods to correctly impute missing data and correct erroneous base assignments, but correction of FPs and FNs remains challenging when the number of SNVs sampled is small relative to the number of cells sequenced.
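To make the three error types named in this abstract concrete, the following minimal sketch counts MBs, FPs, and FNs by comparing an observed single-cell genotype matrix with a known true matrix, as is possible for simulated data. The function name `count_errors`, the 0/1/`None` encoding, and the toy matrices are illustrative assumptions and are not taken from the paper or from any of the benchmarked tools.

```python
from typing import Dict, List, Optional

# Assumed encoding: 1 = mutant base, 0 = wild-type base, None = missing base (MB).
Genotype = List[Optional[int]]


def count_errors(true_cells: List[Genotype],
                 observed_cells: List[Genotype]) -> Dict[str, int]:
    """Count FPs, FNs, MBs, and correct calls across all cells and SNV sites."""
    counts = {"fp": 0, "fn": 0, "missing": 0, "correct": 0}
    for true_row, observed_row in zip(true_cells, observed_cells):
        for truth, observed in zip(true_row, observed_row):
            if observed is None:
                counts["missing"] += 1      # base was not called at all
            elif observed == 1 and truth == 0:
                counts["fp"] += 1           # mutation called where none exists
            elif observed == 0 and truth == 1:
                counts["fn"] += 1           # true mutation was missed
            else:
                counts["correct"] += 1
    return counts


# Hypothetical toy data: two cells, three SNV sites.
true_cells = [[1, 0, 1], [1, 1, 0]]
observed_cells = [[1, 1, None], [0, 1, 0]]
result = count_errors(true_cells, observed_cells)
```

The same comparison, applied to a matrix after imputation and correction by a method, would give the residual error counts that a benchmark of this kind reports.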
Tumors harbor extensive genetic heterogeneity in the form of distinct clone genotypes that arise over time and across different tissues and regions in cancer. Many computational methods produce clone phylogenies from population bulk sequencing data collected from multiple tumor samples from a patient. These clone phylogenies are used to infer mutation order and clone origins during tumor progression, rendering the selection of the appropriate clonal deconvolution method critical. Surprisingly, absolute and relative accuracies of these methods in correctly inferring clone phylogenies are yet to be consistently assessed. Therefore, we evaluated the performance of seven computational methods. The accuracy of the reconstructed mutation order and inferred clone groupings varied extensively among methods. All the tested methods showed limited ability to identify ancestral clone sequences present in tumor samples correctly. The presence of copy number alterations, the occurrence of multiple seeding events among tumor sites during metastatic tumor evolution, and extensive intermixture of cancer cells among tumors hindered the detection of clones and the inference of clone phylogenies for all methods tested. CloneFinder, MACHINA, and LICHeE showed the highest overall accuracy, but none of the methods performed well for all simulated datasets. So, we present guidelines for selecting methods for data analysis.
... ignoring SNV frequencies by using a heuristic algorithm based on co-comparability graphs 38. It addresses the minimum conflict-free row split problem, where rows are tumor genotypes, and observed tumor genotypes are split into clone genotypes.
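As a rough illustration of what "conflict-free" can mean for a binary genotype matrix, the sketch below applies the standard pairwise column test: two mutation columns conflict when some rows show (1,0), some show (0,1), and some show (1,1). That this is the exact definition used by the method cited above is an assumption, and `has_conflict` and the example matrices are hypothetical names and data introduced only for illustration.

```python
from itertools import combinations
from typing import List


def has_conflict(matrix: List[List[int]]) -> bool:
    """Return True if any pair of columns contains all of (1,0), (0,1), (1,1)."""
    n_cols = len(matrix[0]) if matrix else 0
    forbidden = {(1, 0), (0, 1), (1, 1)}
    for a, b in combinations(range(n_cols), 2):
        seen = {(row[a], row[b]) for row in matrix}
        if forbidden <= seen:
            return True
    return False


# Rows are genotypes, columns are mutations (toy data).
nested = [[1, 0, 0],
          [1, 1, 0],
          [0, 0, 1]]
mixed = [[1, 0],
         [0, 1],
         [1, 1]]   # e.g. a sample mixing two clones carries both mutations

nested_conflict = has_conflict(nested)
mixed_conflict = has_conflict(mixed)
```

In the `mixed` example, splitting the third row into the two genotypes `[1, 0]` and `[0, 1]` would remove the conflict, which is the intuition behind splitting observed tumor genotypes into clone genotypes.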
Ultimately, all of these methods deconvolute individual clones from population bulk sequencing of multiple tumor samples acquired over time and/or different locations in a patient. Surprisingly, absolute and relative accuracies of clone phylogenies produced by these computational methods have not been assessed using the same collection of datasets, i.e., their performances are yet to be benchmarked. Such benchmarking is critical because of the biological relevance of the downstream inferences derived by using the results produced by these methods. For example, the accuracies of the order of driver mutations and the interrelationship of clones depend on the performance of current methods in accurately deconvoluting individual clone genotypes and reconstructing evolutionary events 13,34,36. Accurate clone phylogenies are also critical for inferring migration paths. No previous study has evaluated the relative accuracy of clone phylogenies, because their focus has been on introducing and assessing the strengths of the new clone prediction method proposed 13,34-39. Besides, the robustness of these computational methods to the complexity of clonal structures and evolutionary histories from different tumor sites is mostly unknown. Therefore, we evaluated the accuracy of clone phylogenies produced by seven methods t...
Motivation: Analyses of data generated from bulk sequencing of tumors have revealed extensive genomic heterogeneity within patients. Many computational methods have been developed to enable the inference of genotypes of tumor cell populations (clones) from bulk sequencing data. However, the relative and absolute accuracy of available computational methods in estimating clone counts and clone genotypes is not yet known. Results: We have assessed the performance of nine methods, including eight previously published and one new method (CloneFinder), by analyzing computer-simulated datasets. CloneFinder, LICHeE, CITUP, and cloneHD inferred clone genotypes with low error (<5% per clone) for a majority of datasets in which the tumor samples contained evolutionarily related clones. Computational methods did not perform well for datasets in which tumor samples contained mixtures of clones from different clonal lineages. Generally, the number of clones was underestimated by cloneHD and overestimated by PhyloWGS; BayClone2, Canopy, and Clomial required prior information regarding the number of clones. AncesTree and Canopy did not produce results for a large number of datasets. Conclusions: Deconvolution of clone genotypes from single nucleotide variant (SNV) frequency differences among tumor samples remains challenging, so there is a need to develop more accurate computational methods and robust software for clone genotype inference.