2015
DOI: 10.1038/ng.3331

A method to predict the impact of regulatory variants from DNA sequence

Abstract: Most variants implicated in common human disease by Genome-Wide Association Studies (GWAS) lie in non-coding sequence intervals. Despite the suggestion that regulatory element disruption represents a common theme, identifying causal risk variants within indicted genomic regions remains a significant challenge. Here we present a novel sequence-based computational method to predict the effect of regulatory variation, using a classifier (gkm-SVM) which encodes cell-specific regulatory sequence vocabularies. The i…

Citations: cited by 452 publications (535 citation statements)
References: 44 publications
“…To address this, we coupled experimental and bioinformatics approaches to show that mutations in GATA1 and CF BSs can substantially affect genes implicated in MEDs, and created mutation maps of predicted mutations across CREs proximal to MED genes (15,16). These maps may prove useful for prioritizing variants from WGS or targeted sequencing of MED cases, as we have initially shown for PKLR-RE1 mutations that disrupt a TAL1 binding motif, and can identify disruptive as well as gain-of-function mutations.…”
Section: Discussion (mentioning)
confidence: 99%
“…ChIP-seq and RNA-seq were processed as described previously (32). Random forests were used to model expression, k-means and PAM were used for clustering, and gkmer-SVM and DeepBind were used to create mutation maps (15,16). Complete details are available in SI Materials and Methods.…”
Section: Methods (mentioning)
confidence: 99%
“…We computed a "delta" binding score for all possible point mutations within DAP binding sites, defined as the mean decrease in our SVMs' classifier value for the alternate base relative to the reference sequence. This strategy is similar to a previously developed approach, "deltaSVM," that focuses on local disruptions of 10-mer feature weights (Lee et al 2015). Bases with the most negative delta binding score tended to be the most highly conserved for most DAPs (Supplemental Table S10).…”
Section: DAP Binding Analyses To Prioritize Impactful Noncoding Varia (mentioning)
confidence: 98%
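
The "delta" score described in the last citation statement can be illustrated with a small sketch. This is not the gkm-SVM or deltaSVM implementation: the k-mer weights (`TOY_WEIGHTS`), the k-mer length of 3 (the statement refers to 10-mers), the example sequence, and all function names are assumptions made up for illustration. The sketch scores a reference sequence by summing per-k-mer weights, then rescores it with every possible single-base substitution and reports the difference, alternate minus reference.

```python
from typing import Callable, Dict, List, Tuple

BASES = "ACGT"

def kmer_score(seq: str, weights: Dict[str, float], k: int) -> float:
    """Sum the weights of every k-mer in seq; k-mers without a weight count as 0."""
    return sum(weights.get(seq[i:i + k], 0.0) for i in range(len(seq) - k + 1))

def mutation_map(
    ref: str, score: Callable[[str], float]
) -> List[Tuple[int, str, str, float]]:
    """Return (position, ref_base, alt_base, delta) for every point mutation.

    delta = score(alternate sequence) - score(reference sequence), so a
    negative delta means the substitution lowers the classifier-style score.
    """
    ref_score = score(ref)
    rows = []
    for pos, ref_base in enumerate(ref):
        for alt in BASES:
            if alt == ref_base:
                continue
            alt_seq = ref[:pos] + alt + ref[pos + 1:]
            rows.append((pos, ref_base, alt, score(alt_seq) - ref_score))
    return rows

def mean_delta_per_position(
    rows: List[Tuple[int, str, str, float]]
) -> Dict[int, float]:
    """Average the deltas of the three alternate bases at each position."""
    sums: Dict[int, List[float]] = {}
    for pos, _ref_base, _alt, delta in rows:
        sums.setdefault(pos, []).append(delta)
    return {pos: sum(vals) / len(vals) for pos, vals in sums.items()}

# Hypothetical weights: in a real analysis these would come from a trained model.
TOY_WEIGHTS = {"GAT": 1.0, "ATA": 1.0, "TAA": 0.5}
K = 3

reference = "AGATAAG"  # made-up sequence
deltas = mutation_map(reference, lambda s: kmer_score(s, TOY_WEIGHTS, K))
per_position = mean_delta_per_position(deltas)

if __name__ == "__main__":
    for pos in sorted(per_position):
        print(pos, reference[pos], per_position[pos])
```

Positions whose substitutions destroy several weighted k-mers receive the most negative mean delta, which is the kind of signal the citing authors compare against conservation.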