Despite the identification of many genetic variants contributing to human disease (the 'disease genome'), establishing reliable molecular diagnoses remains challenging in many cases. The ability to sequence the genomes of patients has been transformative, but the difficulty of interpreting voluminous genetic variation often confounds recognition of the underlying causal variants. There are numerous predictors of pathogenicity for individual DNA variants, but their utility is reduced because many plausibly pathogenic variants are probably neutral. The rapidly increasing quantity and quality of information on the properties of genes suggest that gene-specific information might be useful for prediction of causal variation when used alongside variant-specific predictors of pathogenicity. The key to understanding the role of genes in disease relates in part to gene essentiality, which has recently been approximated, for example, by quantifying the degree of intolerance of individual genes to loss-of-function variation. Increasing understanding of the interplay between genetic recombination, selection and mutation, and of their relationship to gene essentiality, suggests that gene-specific information may be useful for the interpretation of sequenced genomes. Considered alongside additional distinctive properties of the disease genome, such as the timing of the evolutionary emergence of genes and the roles of their products in protein networks, the case for using gene-specific measures to guide filtering of sequenced genomes seems strong.
The evolution of next-generation sequencing (NGS) technologies has facilitated the detection of causal genetic variants in diseases previously undiagnosed at a molecular level. However, in genome sequencing studies, the identification of disease genes among a candidate gene list is often difficult because of the large number of apparently damaging (but usually neutral) variants. A number of variant prioritization tools have been developed to help detect disease-causal sites. However, the results may be misleading, as many variants scored as damaging by these tools are in fact tolerated, and prediction results are inconsistent among the different variant-level prediction tools. Recently, studies have indicated that understanding gene properties might improve detection of genes liable to have associated disease variation and that this information improves molecular diagnostics. The purpose of this systematic review is to evaluate how understanding gene-specific properties might improve filtering strategies in clinical sequence data to prioritise potential disease variants. Improved understanding of the "disease genome", which includes coding, non-coding and regulatory variation, might help resolve difficult cases. This review provides a comprehensive assessment of existing gene-level approaches, the relationships between measures of gene pathogenicity, and how use of these prediction tools can be developed for molecular diagnostics.
The causal genetic variants underlying more than 50% of single gene (monogenic) disorders are yet to be discovered. Many patients with conditions likely to have a monogenic basis do not receive a confirmed molecular diagnosis, which has potential impacts on clinical management. We have developed a gene-specific score, essentiality-specific pathogenicity prioritization (ESPP), to guide the recognition of genes likely to underlie monogenic disease variation and so assist in filtering of genome sequence data. When a patient genome is sequenced, several plausibly pathogenic variants are frequently identified in different genes. Recognition of the single gene most likely to include pathogenic variation can guide the identification of a causal variant. The ESPP score integrates gene-level scores which are broadly related to gene essentiality. Previous work towards the recognition of monogenic disease genes proposed a model with increasing gene essentiality from ‘non-essential’ to ‘essential’ genes (for which pathogenic variation may be incompatible with survival), with genes liable to contain disease variation positioned between these two extremes. We demonstrate that the ESPP score is useful for recognizing genes with high potential for pathogenic disease-related variation. Genes classed as essential have particularly high scores, as do genes recently recognized as strong candidates for developmental disorders. Through the integration of individual gene-specific scores, which have different properties and assumptions, we demonstrate the utility of an essentiality-based gene score to improve filtering of sequenced genomes.
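The general idea of combining several gene-level scores to rank candidate genes can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how ESPP integrates its component scores, so the percentile-rank averaging used here is a stand-in, not the published method. The gene names, score values and function names are all hypothetical.

```python
from statistics import mean

def percentile_ranks(scores):
    """Rescale one gene-level score to 0..1 by rank (higher = more essential).

    Assumption: each input score is oriented so that larger values mean
    greater essentiality. Ties are broken arbitrarily by sort order.
    """
    ordered = sorted(scores, key=scores.get)
    n = len(ordered)
    if n == 1:
        return {ordered[0]: 1.0}
    return {gene: i / (n - 1) for i, gene in enumerate(ordered)}

def combined_gene_score(score_tables):
    """Average the percentile ranks of genes present in every score table.

    Stand-in integration step; NOT the ESPP formula.
    """
    ranks = [percentile_ranks(table) for table in score_tables]
    shared = set.intersection(*(set(r) for r in ranks))
    return {gene: mean(r[gene] for r in ranks) for gene in shared}

def prioritise(candidate_genes, combined):
    """Order a patient's candidate genes by combined score, highest first."""
    scored = [g for g in candidate_genes if g in combined]
    return sorted(scored, key=combined.get, reverse=True)

# Made-up values for two hypothetical gene-level scores.
score_1 = {"GENE_A": 0.9, "GENE_B": 0.2, "GENE_C": 0.5}
score_2 = {"GENE_A": 0.7, "GENE_B": 0.1, "GENE_C": 0.6}

combined = combined_gene_score([score_1, score_2])
ranking = prioritise(["GENE_B", "GENE_C", "GENE_A"], combined)
print(ranking)
```

Rank-based rescaling is used here because the component scores are described as having different properties and assumptions, so their raw values are unlikely to be directly comparable. In practice the ranked gene list would be used alongside variant-level predictors rather than in place of them.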