2012
DOI: 10.1093/nar/gkr1257

A comprehensive framework for prioritizing variants in exome sequencing studies of Mendelian diseases

Abstract: Exome sequencing is a promising strategy for finding novel mutations in human monogenic disorders. However, pinpointing the causal mutation in a small number of samples remains a major challenge. Here, we propose a three-level filtration and prioritization framework to identify the causal mutation(s) in exome sequencing studies. This efficient and comprehensive framework successfully narrowed down whole-exome variants to very small numbers of candidate variants in the proof-of-concept examples. The proposed frame…

Cited by 236 publications (213 citation statements)
References 29 publications
“…The rare variants were defined as the variants with a MAF of ≤0.01 in any of the databases in the 1000 Genomes Project (all population, Asian population, European population, and African population) and ESP6500. We defined the deleterious variants as the loss-of-function (stopgain, frameshift insertion, and frameshift deletion) variants, nonframeshift INDELs, or missense variants that are predicted to be damaging by KGGSeq (21). KGGSeq combines the prediction scores from five algorithms (SIFT, Polyphen-2, LRT, MutationTaster, and PhyloP) by the logistic regression method to estimate a probability of a variant being pathogenic.…”
Section: Methods (mentioning)
confidence: 99%
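
The filtering rule quoted in this statement can be illustrated with a minimal Python sketch. It is not the citing authors' pipeline or KGGSeq's code. The `Variant` class, the effect labels, and the database keys are hypothetical names introduced for illustration. It also assumes that "rare" means a MAF at or below 0.01 in every listed database, with a missing frequency treated as rare; the quoted wording ("in any of the databases") may be read differently.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

MAF_THRESHOLD = 0.01  # threshold taken from the quoted statement

# Hypothetical effect labels; real annotation tools use their own vocabularies.
LOF_EFFECTS = {"stopgain", "frameshift_insertion", "frameshift_deletion"}
NONFRAMESHIFT_INDELS = {"nonframeshift_insertion", "nonframeshift_deletion"}


@dataclass
class Variant:
    effect: str
    # Database name -> minor allele frequency (None if the variant is absent).
    mafs: Dict[str, Optional[float]] = field(default_factory=dict)
    # Whether an external predictor flagged a missense variant as damaging.
    predicted_damaging: bool = False


def is_rare(v: Variant, threshold: float = MAF_THRESHOLD) -> bool:
    """Assumption: rare means MAF <= threshold in every database checked."""
    return all(m is None or m <= threshold for m in v.mafs.values())


def is_deleterious(v: Variant) -> bool:
    """Loss-of-function, nonframeshift indel, or damaging missense."""
    if v.effect in LOF_EFFECTS or v.effect in NONFRAMESHIFT_INDELS:
        return True
    return v.effect == "missense" and v.predicted_damaging


def keep(v: Variant) -> bool:
    """A variant passes the filter only if it is both rare and deleterious."""
    return is_rare(v) and is_deleterious(v)
```

For example, `keep(Variant("stopgain", {"1000G_ALL": 0.001, "ESP6500": None}))` passes, while a missense variant with a MAF of 0.2 in any database is dropped even if it is predicted to be damaging.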
“…They are located at the evolutionally conserved residues (SI Appendix, Fig. S4) and are predicted to be pathogenic by KGGSeq (21). Out of five variants identified in the EAO cases, two variants (c.G979A:p.A327T and c.G917A:p.R306H) are only 62 bp apart.…”
Section: Gene-level Analysis Identified the Association Of Mst1r Dele… (mentioning)
confidence: 99%
“…GATK version 2.4.7 (Broad Institute, Cambridge, MA, USA) with the UnifiedGenotyper algorithm was applied for variant calling including all steps mentioned in the best practice pipeline [11]. KGG-seq [12] was used for annotation of detected variants, but in-house scripts were applied for filtering based on family pedigree and intersections.…”
Section: Exome Sequencing (mentioning)
confidence: 99%
“…The prediction scores of some of these methods have been compiled in the dbNSFP database for all known protein coding genome positions [4]. Besides, Li and colleagues proposed to combine five of them in a logistic regression framework [5] in order to globally improve predictive performance in comparison with individual scores. The idea of this contribution is twofold.…”
Section: Introduction (mentioning)
confidence: 99%
“…The idea of this contribution is twofold. First, we propose to combine the scores of different functional predictors using a machine learning method (random forest) that should be more suited to the nature of the problem than the logistic regression framework of Li and colleagues [5]. The performance of this method will be compared to the five models taken separately, to a logistic regression framework and to the recently published CADD method [6].…”
Section: Introduction (mentioning)
confidence: 99%
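
Several of the statements above describe combining five predictor scores through logistic regression. The following is a minimal sketch of what such a combination looks like in general, under stated assumptions: the function name, the score scaling, and any weights or intercept passed in are hypothetical placeholders, not KGGSeq's fitted coefficients or its actual implementation.

```python
import math
from typing import Mapping

# The five predictors named in the quoted statements.
PREDICTORS = ("SIFT", "Polyphen-2", "LRT", "MutationTaster", "PhyloP")


def combined_probability(
    scores: Mapping[str, float],
    weights: Mapping[str, float],
    intercept: float,
) -> float:
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum_i w_i * s_i))).

    `weights` and `intercept` would normally be fitted on labeled
    pathogenic and benign variants; here they are caller-supplied
    placeholders.
    """
    z = intercept + sum(weights[name] * scores[name] for name in PREDICTORS)
    return 1.0 / (1.0 + math.exp(-z))
```

With all weights and the intercept set to zero the function returns 0.5, which is the uninformative baseline. A fitted model moves the probability toward 0 or 1 as the individual scores agree that a variant is benign or damaging.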