The vast majority of cancer next-generation sequencing data consist of bulk samples composed of mixtures of cancer and normal cells. To study tumor evolution, subclonal reconstruction approaches based on machine learning are used to separate subpopulations of cancer cells and reconstruct their ancestral relationships. However, current approaches are entirely data-driven and agnostic to evolutionary theory. We demonstrate that systematic errors occur in subclonal reconstruction if tumor evolution is not accounted for, and that those errors increase when multiple samples are taken from the same tumor. To address this issue, we present a novel approach for model-based subclonal reconstruction that combines data-driven machine learning with evolutionary theory. Using public, synthetic and newly generated data, we show the method is more robust and accurate than current techniques in both single-sample and multi-region sequencing data. With careful data curation and interpretation, we show how the method minimizes the confounding factors that affect non-evolutionary methods, leading to a more accurate recovery of the evolutionary history of human tumors.
The evolutionary events that cause colorectal adenomas (benign) to progress to carcinomas (malignant) remain largely undetermined. Using multi-region genome and exome sequencing of 24 benign and malignant colorectal tumours, we investigate the evolutionary fitness landscape occupied by these neoplasms. Unlike carcinomas, advanced adenomas frequently harbour sub-clonal driver mutations (considered to be functionally important in the carcinogenic process) that have not swept to fixation, and have relatively high genetic heterogeneity. Carcinomas are distinguished from adenomas by widespread aneusomies that are usually clonal and often accrue in a 'punctuated' fashion. We conclude that adenomas evolve across an undulating fitness landscape, whereas carcinomas occupy a sharper fitness peak, probably owing to stabilizing selection.
Evolutionary genomic analysis revealed precancer clones bearing extensive SNAs and CNAs, with progression to cancer involving a dramatic accrual of CNAs at HGD. Detection of the cancerised field is an encouraging prospect for surveillance, but punctuated evolution may limit the window for early detection.