2020
DOI: 10.1038/s41592-020-0907-8

Supervised enhancer prediction with epigenetic pattern recognition and targeted validation

Abstract: Enhancers are important noncoding elements, but they have been traditionally hard to characterize experimentally. The development of massively parallel assays allows the characterization of large numbers of enhancers for the first time. Here, we developed a framework using Drosophila STARR-seq to create shape-matching filters based on meta-profiles of epigenetic features. We integrated these features with supervised machine-learning algorithms to predict enhancers. We further demonstrated our model could be tr…

Cited by 90 publications (125 citation statements)
References: 57 publications
“…In addition to the Registry of cCREs described in this report, one of the ENCODE companion papers developed a machine learning model that draws on the depth of ENCODE data in selected reference cell types to predict enhancers from self-transcribing active regulatory region sequencing (STARR-seq) data 52 . Another ENCODE companion paper expanded this model to connect cCREs with genes and thereby to construct large-scale regulatory networks that serve as a valuable resource for disease studies 38 .…”
Section: Other Approaches Using Machine Learning (mentioning)
confidence: 99%
“…This has important implications for methods derived from STARR-seq and methods based on quantification of self-transcripts. Those methods have become increasingly more important for global regulatory elements identification efforts of large consortia and especially for the new stage of ENCODE projects emphasizing on functional analysis 51 , and new prediction method 52 that had been developed based on enhancers identified by STARR-seq could also be affected.…”
Section: Discussion (mentioning)
confidence: 99%
“…Specifically, we downloaded the signal tracks for the five epigenetic features from the ENCODE portal, extracted the signals for each given region as input, and predicted for the existence of enhancers using our trained model. We compared our predictions with Matched-Filter, the leading method in the ENCODE enhancer challenge (Sethi et al, 2020).…”
Section: Decode Outperforms the Existing State-of-the-art Methods On E (mentioning)
confidence: 99%
“…Second, some methods employ supervised approaches to identify target regulatory elements using hypothetical enhancer loci or a limited number of validated enhancers, which are underpowered for training a reliable model for accurate prediction (Alipanahi et al., 2015; Chen et al., 2018; Li et al., 2018; Lu et al., 2015; Min et al., 2017; Tang et al., 2020). We recently developed a linear predictive model based on shape-matching filters from multiple epigenetic features trained from genome-scale STARR-seq experiments on Drosophila (Sethi et al., 2020). However, most existing methods can only make binary predictions within a given region (>200 bp) and do not have a high enough resolution for more precise enhancer localization and boundary detection.…”
Section: Introduction (mentioning)
confidence: 99%