2019
DOI: 10.1002/humu.23797

Integration of multiple epigenomic marks improves prediction of variant impact in saturation mutagenesis reporter assay

Abstract: The integrative analysis of high‐throughput reporter assays, machine learning, and profiles of epigenomic chromatin state in a broad array of cells and tissues has the potential to significantly improve our understanding of noncoding regulatory element function and its contribution to human disease. Here, we report results from the CAGI 5 regulation saturation challenge where participants were asked to predict the impact of nucleotide substitution at every base pair within five disease‐associated human enhance…

Cited by 52 publications (63 citation statements)
References 34 publications
“…We trained a separate model for the effects of variants in promoter sequences and in gene‐distal sequences (Section 2). The assessors of this challenge (Shigaki et al, 2019) evaluated accuracy using Pearson correlation of the predicted labels (−1, 0, 1) with the continuous MPRA expression impact scores. They also calculated the AUROC treating this as a discretized classification task (e.g., 1 vs. [0 and −1]).…”
Section: Results (mentioning)
confidence: 99%
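The two evaluation metrics named in this statement can be sketched as follows. This is a minimal illustration, not the assessors' actual code: the toy values, the variable names, and the choice to use the predicted label itself as the ranking score for the AUROC are all assumptions made for the example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def auroc(scores, is_positive):
    """AUROC as the probability that a random positive outranks a random
    negative, counting ties as one half (pairwise Mann-Whitney form)."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data (NOT from the challenge): predicted direction labels,
# continuous measured effects, and discretized true classes per variant.
predicted_labels = [1, 0, -1, 1, 0, -1]
mpra_scores = [0.8, 0.1, -0.6, 0.3, -0.2, -0.9]
true_classes = [1, 0, -1, 0, 0, -1]

# Metric 1: correlate discrete predictions with continuous measurements.
r = pearson(predicted_labels, mpra_scores)

# Metric 2: binarize the true classes as "1 vs. [0 and -1]" and rank
# variants by predicted label (assumption: label doubles as the score).
is_up = [c == 1 for c in true_classes]
auc = auroc(predicted_labels, is_up)

print(f"Pearson r = {r:.3f}, AUROC (1 vs rest) = {auc:.3f}")
```

The pairwise AUROC above is quadratic in the number of variants, which is fine for a sketch; a rank-based implementation would be preferable for a full saturation mutagenesis dataset.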
“…We trained a separate model for the effects of variants in promoter sequences and in gene-distal sequences (Section 2). The assessors of this challenge (Shigaki et al, 2019) Table S7).…”
Section: Studying the Effects Of Small Genetic Variants On MPRA Out… (mentioning)
confidence: 99%
“…We also found two other novel impact UTR variants in PLEC (the gene has an autosomal recessive inheritance pattern in OMIM), with an OMIM disease description of epidermolysis bullosa with pyloric atresia, related to the patient's fragile skin and food intolerance symptoms. As discussed above, present methods for identifying noncoding impact variants are not mature. Recent strong CAGI results for predicting which variants affect expression are encouraging in this regard (Shigaki et al, 2019), and it would be interesting to see how some of the more successful methods used there perform on the SickKids data. Nonstandard descriptions in clinical reports: an advantage of the SickKids data is the use of HPO terms to describe patients’ symptoms (Girdea et al, ). That greatly facilitated the identification of candidate genes, and its broader adoption by other analysis centers will improve performance.…”
Section: Discussion (mentioning)
confidence: 99%
“…The Critical Assessment of Genome Interpretation 5 (CAGI5) consortium performed an MPRA of saturation-mutagenized human regulatory elements and disease-associated promoters in numerous cell types. A subset of these data were provided to analysts, who were then challenged to computationally predict functionality and effect sizes of held-out variants (58). The end project focused on identifying the most informative datasets (e.g., chromatin modification data from ENCODE) for cell-type specific regulatory variant prediction.…”
Section: MPRAs Enable Identification Of Functional Regulators and Var… (mentioning)
confidence: 99%