With the rapid development of high-throughput sequencing technologies, known as next-generation sequencing (NGS), our approach to gene hunting and diagnosis has changed drastically. In less than 10 years, these technologies have moved from gene panels to whole-genome sequencing and from an exclusively research context to clinical practice. Today, the limit is no longer the sequencing of one, many or all genes but the data analysis. Consequently, the challenge is to rapidly and efficiently identify disease-causing mutations among millions of variants. To do so, we developed the VarAFT software to annotate and pinpoint human disease-causing mutations through access to multiple layers of information. VarAFT was designed for both research and clinical contexts and is accessible to all scientists, regardless of bioinformatics training. Data from multiple samples may be combined to address all Mendelian inheritance modes, cancers or population genetics. Optimized filtration parameters can be stored and re-applied to large datasets. In addition to classical annotations from dbNSFP, VarAFT contains unique features at the disease (OMIM), phenotypic (HPO), gene (Gene Ontology, pathways) and variation levels (predictions from UMD-Predictor and Human Splicing Finder) that can be combined to optimally select candidate pathogenic mutations. VarAFT is freely available at: http://varaft.eu.
Whole-exome sequencing (WES) is increasingly applied to research and clinical diagnosis of human diseases. It typically yields a large number of genetic variations. Depending on the mode of inheritance, only one or two of them correspond to pathogenic mutations that are responsible for the disease and present in affected individuals. Therefore, it is crucial to filter out nonpathogenic variants and limit downstream analysis to a handful of candidate mutations. We have developed a new computational combinatorial system, UMD-Predictor (http://umd-predictor.eu), to efficiently annotate cDNA substitutions of all human transcripts for their potential pathogenicity. It combines biochemical properties, impact on splicing signals, localization in protein domains, variation frequency in the global population, and conservation through the BLOSUM62 global substitution matrix and a protein-specific conservation among 100 species. We compared its accuracy with the seven most used and reliable prediction tools, using the largest reference variation datasets, which include more than 140,000 annotated variations. This system consistently demonstrated better accuracy, specificity, Matthews correlation coefficient, diagnostic odds ratio and speed, and provided the shortest list of candidate mutations for WES. Webservices allow its implementation in any bioinformatics pipeline for next-generation sequencing analysis. It could benefit a wide range of users and applications, from gene discovery to clinical diagnosis.
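For readers unfamiliar with the benchmark metrics named above, the following minimal sketch shows how accuracy, specificity, the Matthews correlation coefficient and the diagnostic odds ratio are conventionally computed from a confusion matrix of pathogenic versus nonpathogenic calls. The function name `confusion_metrics` and the counts are invented for illustration; they are not taken from the UMD-Predictor evaluation and do not reproduce its results.

```python
import math


def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classifier metrics from confusion-matrix counts.

    tp/fn: pathogenic variants predicted pathogenic / nonpathogenic
    tn/fp: nonpathogenic variants predicted nonpathogenic / pathogenic
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    specificity = tn / (tn + fp)
    sensitivity = tp / (tp + fn)

    # Matthews correlation coefficient: ranges from -1 to +1, 0 is chance level.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0

    # Diagnostic odds ratio: odds of a positive call among true positives
    # relative to the odds among true negatives; infinite if fp or fn is zero.
    dor = (tp * tn) / (fp * fn) if fp and fn else math.inf

    return {
        "accuracy": accuracy,
        "specificity": specificity,
        "sensitivity": sensitivity,
        "mcc": mcc,
        "diagnostic_odds_ratio": dor,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = confusion_metrics(tp=90, fp=10, tn=80, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, so it remains informative when pathogenic and nonpathogenic variants are unevenly represented in a reference dataset.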