There is a growing need to develop novel therapeutics for targeted treatment of cancer. A prerequisite for success is knowing which types of molecular alterations predominantly drive tumorigenesis. To shed light on this question, we used the largest database of human cancer mutations, TCGA PanCanAtlas, together with multiple established algorithms for cancer driver prediction (2020plus, CHASMplus, CompositeDriver, dNdScv, DriverNet, HotMAPS, OncodriveCLUSTL, OncodriveFML), and developed four novel computational pipelines: SNADRIF (Single Nucleotide Alteration DRIver Finder), GECNAV (Gene Expression-based Copy Number Alteration Validator), ANDRIF (ANeuploidy DRIver Finder) and PALDRIC (PAtient-Level DRIver Classifier). We created a unified workflow that integrates all these pipelines, algorithms and datasets at the cohort and patient levels. We found that there are on average 12 driver events per tumour, of which 0.6 are single nucleotide alterations (SNAs) in oncogenes, 1.5 are amplifications of oncogenes, 1.2 are SNAs in tumour suppressors, 2.1 are deletions of tumour suppressors, 1.5 are driver chromosome losses, 1 is a driver chromosome gain, 2 are driver chromosome arm losses, and 1.5 are driver chromosome arm gains. The average number of driver events per tumour increases with age (from 7 to 15) and cancer stage (from 10 to 15) and varies strongly between cancer types (from 1 to 24). Patients with 1 and 7 driver events per tumour are the most frequent, and very few patients have more than 40 events. In tumours with only one driver event, that event is most often an SNA in an oncogene. However, as the number of driver events per tumour increases, the contribution of SNAs decreases, whereas the contribution of copy-number alterations and aneuploidy events increases.
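The itemised averages above can be tallied by alteration class. The short Python sketch below does only that arithmetic on the numbers quoted in the abstract; the three-way grouping into SNA, CNA and aneuploidy classes and all variable names are assumptions of this sketch, not part of the authors' pipelines.

```python
# Per-tumour averages quoted in the abstract, grouped by alteration class.
# The grouping and the names used here are assumptions made for illustration.
averages = {
    "SNA": {
        "oncogene SNAs": 0.6,
        "tumour suppressor SNAs": 1.2,
    },
    "CNA": {
        "oncogene amplifications": 1.5,
        "tumour suppressor deletions": 2.1,
    },
    "aneuploidy": {
        "chromosome losses": 1.5,
        "chromosome gains": 1.0,
        "chromosome arm losses": 2.0,
        "chromosome arm gains": 1.5,
    },
}

# Sum the itemised averages within each class, then across classes.
class_totals = {name: sum(events.values()) for name, events in averages.items()}
itemised_total = sum(class_totals.values())

for name, total in class_totals.items():
    share = total / itemised_total
    print(f"{name}: {total:.1f} events per tumour ({share:.0%} of itemised total)")
print(f"itemised total: {itemised_total:.1f} events per tumour")
```

The eight itemised values sum to 11.4 events, a little below the headline average of 12; the abstract does not say how the headline figure was rounded or computed.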
Elucidating crucial driver genes is paramount for understanding cancer origins and mechanisms of progression, as well as for selecting targets for molecular therapy. Cancer genes are usually ranked by mutation frequency, which, however, does not necessarily reflect their driver strength. Here we hypothesize that driver strength is higher for genes that are preferentially mutated in patients with few driver mutations overall, because these few mutations should be strong enough to initiate cancer. We propose a formula to calculate the corresponding Driver Strength Index (DSI), as well as the Normalized Driver Strength Index (NDSI), the latter being completely independent of the overall gene mutation frequency. We validate these indices using the largest database of human cancer mutations, TCGA PanCanAtlas, multiple established algorithms for cancer driver prediction (2020plus, CHASMplus, CompositeDriver, dNdScv, DriverNet, HotMAPS, IntOGen Plus, OncodriveCLUSTL, OncodriveFML) and four custom computational pipelines that integrate driver contributions from SNAs, CNAs and aneuploidy at patient-level resolution. We demonstrate that NDSI ranks genes substantially differently from both DSI and the frequency approach. For example, NDSI highlighted the importance of the guanine nucleotide-binding protein subunits GNAQ, GNA11, GNAI1, GNAZ and GNB3, the General Transcription Factor II family members GTF2I and GTF2F2, and the fibroblast growth factor receptors FGFR2 and FGFR3. Intriguingly, NDSI prioritized CIC, FUBP1, IDH1 and IDH2 mutations, as well as 19q and 1p chromosome arm losses, which are characteristic molecular alterations of gliomas. KEGG analysis shows that the top NDSI-ranked genes comprise the PDGFRA-GRB2-SOS2-HRAS/NRAS-BRAF pathway, the GNAQ/GNA11-HRAS/NRAS-BRAF pathway, the GNB3-AKT1-IKBKG/GSK3B/CDKN1B pathway and the TCEB1-VHL pathway. NDSI does not seem to correlate with the number of protein-protein interactions.
We share our software to enable calculation of DSI and NDSI for outputs of any third-party driver prediction algorithms or their combinations.
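The hypothesis behind these indices can be illustrated with a toy calculation. The abstract does not give the DSI or NDSI formulas, so the Python sketch below does not reproduce them. As a stand-in, it weights each driver event by the reciprocal of the number of driver events in the patient carrying it, then reports a summed weight (which grows with the number of carriers) and a mean weight per carrier (which does not). The function name, the weighting scheme, the patient records and the gene labels are all hypothetical.

```python
from collections import defaultdict


def toy_strength_scores(patients):
    """Score driver events by how few co-occurring drivers their carriers have.

    patients: dict mapping a patient ID to the set of driver events in that tumour.
    NOTE: the 1/n weighting is an assumed stand-in, not the published DSI/NDSI.
    """
    weights = defaultdict(list)
    for events in patients.values():
        n = len(events)
        if n == 0:
            continue  # tumours without any predicted driver contribute nothing
        for event in events:
            # An event counts for more when it is one of only a few drivers.
            weights[event].append(1.0 / n)

    cohort_size = len(patients)
    scores = {}
    for event, w in weights.items():
        scores[event] = {
            "frequency": len(w) / cohort_size,  # conventional ranking criterion
            "summed_weight": sum(w),            # still grows with carrier count
            "mean_weight": sum(w) / len(w),     # normalised by carrier count
        }
    return scores


# Made-up cohort: GENE_A is rare but occurs in tumours with few drivers,
# GENE_B is common but mostly occurs alongside several other drivers.
toy_patients = {
    "P1": {"GENE_A"},
    "P2": {"GENE_A", "GENE_B"},
    "P3": {"GENE_B", "GENE_C", "GENE_D", "GENE_E"},
    "P4": {"GENE_B", "GENE_C", "GENE_D", "GENE_E"},
    "P5": {"GENE_B", "GENE_C"},
}

scores = toy_strength_scores(toy_patients)
by_frequency = max(scores, key=lambda g: scores[g]["frequency"])
by_mean_weight = max(scores, key=lambda g: scores[g]["mean_weight"])
print("top by frequency:", by_frequency)
print("top by mean weight:", by_mean_weight)
```

In this toy cohort the most frequently mutated gene and the gene with the highest mean weight differ, which is the kind of divergence between a frequency ranking and a strength ranking that the abstract describes.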