The Human Gene Mutation Database (HGMD®) constitutes a comprehensive collection of published germline mutations in nuclear genes that underlie, or are closely associated with, human inherited disease. At the time of writing (March 2017), the database contained in excess of 203,000 different gene lesions identified in over 8000 genes manually curated from over 2600 journals. With new mutation entries currently accumulating at a rate exceeding 17,000 per annum, HGMD represents de facto the central unified gene/disease-oriented repository of heritable mutations causing human genetic disease used worldwide by researchers, clinicians, diagnostic laboratories and genetic counsellors, and is an essential tool for the annotation of next-generation sequencing data. The public version of HGMD (http://www.hgmd.org) is freely available to registered users from academic institutions and non-profit organisations whilst the subscription version (HGMD Professional) is available to academic, clinical and commercial users under license via QIAGEN Inc.
The Human Gene Mutation Database (HGMD®) constitutes a comprehensive collection of published germline mutations in nuclear genes that are thought to underlie, or are closely associated with, human inherited disease. At the time of writing (June 2020), the database contains in excess of 289,000 different gene lesions identified in over 11,100 genes manually curated from 72,987 articles published in over 3100 peer-reviewed journals. Two main groups of users utilise HGMD on a regular basis: research scientists and clinical diagnosticians. This review aims to highlight how to make the most out of HGMD data in each setting.
It has long been known that canonical 5′ splice site (5′SS) GT>GC variants may be compatible with normal splicing. However, to date, the actual scale of canonical 5′SSs capable of generating wild‐type transcripts in the case of GT>GC substitutions remains unknown. Herein, combining data derived from a meta‐analysis of 45 human disease‐causing 5′SS GT>GC variants and a cell culture‐based full‐length gene splicing assay of 103 5′SS GT>GC substitutions, we estimate that ~15–18% of canonical GT 5′SSs retain their capacity to generate between 1% and 84% normal transcripts when GT is substituted by GC. We further demonstrate that the canonical 5′SSs in which substitution of GT by GC generated normal transcripts exhibit stronger complementarity to the 5′ end of U1 snRNA than those sites whose substitutions of GT by GC did not lead to the generation of normal transcripts. We also observed a correlation between the generation of wild‐type transcripts and a milder than expected clinical phenotype but found that none of the available splicing prediction tools were capable of reliably distinguishing 5′SS GT>GC variants that generated wild‐type transcripts from those that did not. Our findings imply that 5′SS GT>GC variants in human disease genes may not invariably be pathogenic.
We describe a novel approach, selectively amplified microsatellite (SAM) analysis, for the targeted development of informative simple sequence repeat (SSR) markers. A modified selectively amplified microsatellite polymorphic loci assay is used to generate multi-locus SSR fingerprints that provide a source of polymorphic DNA markers (SAMs) for use in genetic studies. These polymorphisms capture the repeat length variation associated with SSRs and allow their chromosomal location to be determined prior to the expense of isolating and characterising individual loci. SAMs can then be converted to locus-specific SSR markers with the design and synthesis of a single primer specific to the conserved region flanking the repeat. This approach offers a cost-efficient and rapid method for developing SSR markers for predetermined chromosomal locations and of potential informativeness. The high recovery rate of useful SSR markers makes this strategy a valuable tool for population and genetic mapping studies. The utility of SAM analysis was demonstrated by the development of SSR markers in bread wheat.
Introduction: 5' splice site GT>GC or +2T>C variants have been frequently reported to cause human genetic disease and are routinely scored as pathogenic splicing mutations. However, we have recently demonstrated that such variants in human disease genes may not invariably be pathogenic. Moreover, we found that no splicing prediction tools appear to be capable of reliably distinguishing those +2T>C variants that generate wild-type transcripts from those that do not. Methodology: Herein, we evaluated the performance of a novel deep learning-based tool, SpliceAI, in the context of three datasets of +2T>C variants, all of which had been characterized functionally in terms of their impact on pre-mRNA splicing. The first two datasets refer to our recently described “in vivo” dataset of 45 known disease-causing +2T>C variants and the “in vitro” dataset of 103 +2T>C substitutions subjected to a full-length gene splicing assay. The third dataset comprised 12 BRCA1 +2T>C variants that were recently analyzed by saturation genome editing. Results: Comparison of the SpliceAI-predicted and experimentally obtained functional impact assessments of these variants (and smaller datasets of +2T>A and +2T>G variants) revealed that although SpliceAI performed rather better than other prediction tools, it was still far from perfect. A key issue was that the impact of those +2T>C (and +2T>A) variants that generated wild-type transcripts represents a quantitative change that can vary from barely detectable to an almost full expression of wild-type transcripts, with wild-type transcripts often co-existing with aberrantly spliced transcripts. Conclusion: Our findings highlight the challenges that we still face in attempting to accurately identify splice-altering variants.