The dispensability of individual genes for viability has interested generations of geneticists. For some genes it is essential to maintain two functional chromosomal copies, while others may tolerate the loss of one or both copies. Exome sequence data from 60,706 individuals provide sufficient observations of rare protein-truncating variants (PTVs) to make genome-wide estimates of selection against heterozygous loss of gene function. The cumulative frequency of rare deleterious PTVs is primarily determined by the balance between incoming mutations and purifying selection rather than by genetic drift. This enables the estimation of the genome-wide distribution of selection coefficients for heterozygous PTVs and corresponding Bayesian estimates for individual genes. The strength of selection can discriminate the severity, age of onset, and mode of inheritance in Mendelian exome-sequencing cases. We find that genes under the strongest selection are enriched in embryonic lethal mouse knockouts, putatively cell-essential genes, Mendelian disease genes, and regulators of transcription. Screening by essentiality, we find a large set of genes under strong selection that likely have critical functions but have not yet been extensively annotated in published literature.
Cancer genomes contain large numbers of somatic mutations, but few of these mutations drive tumor development. Current approaches identify driver genes based on mutational recurrence, or they approximate the functional consequences of nonsynonymous mutations using bioinformatic scores. While passenger mutations are enriched in characteristic nucleotide contexts, driver mutations occur in functional positions, which are not necessarily surrounded by a particular nucleotide context. We observed that mutations in contexts that deviate from the characteristic contexts around passenger mutations provide a signal in favor of driver genes. We therefore developed a method that combines this feature with the signals traditionally used for driver gene identification. We applied our method to whole-exome sequencing data from 11,873 tumor-normal pairs and identified 460 driver genes that clustered into 21 cancer-related pathways. Our study provides a resource of driver genes across 28 tumor types with additional driver genes identified based on mutations in unusual nucleotide contexts.
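The idea that a mutation in an unusual nucleotide context carries a signal can be sketched as a surprisal score. This is an illustrative sketch only: the abstract does not specify the authors' statistical model, so the scoring rule, the function name, the pseudocount, and the toy context labels below are all assumptions. The only outside fact used is the standard count of 96 trinucleotide substitution classes.

```python
import math
from collections import Counter
from typing import Sequence


def context_surprisal(passenger_contexts: Sequence[str],
                      query_context: str,
                      pseudocount: float = 1.0,
                      n_context_types: int = 96) -> float:
    """Surprisal (in bits) of a mutation's context under a passenger background.

    The background is the smoothed empirical frequency of each context among
    presumed passenger mutations. A higher value means the context is rarer
    in that background, that is, more unusual.
    """
    counts = Counter(passenger_contexts)
    # Additive smoothing so unseen contexts get a small nonzero probability.
    total = len(passenger_contexts) + pseudocount * n_context_types
    p = (counts[query_context] + pseudocount) / total
    return -math.log2(p)


# Toy background dominated by one characteristic context (hypothetical labels).
background = ["TCA>T"] * 90 + ["ACG>T"] * 10
for ctx in ("TCA>T", "ACG>T", "GTA>C"):
    print(ctx, round(context_surprisal(background, ctx), 2))
```

In this toy setup, a mutation in a context absent from the background scores higher than one in the dominant context. A driver gene test along the lines the abstract describes would combine such a signal with recurrence and functional-impact evidence rather than use it alone.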