2021
DOI: 10.1101/2021.10.05.463203
Preprint

DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of enhancers

Abstract: Enhancer sequences control gene expression and comprise binding sites (motifs) for different transcription factors (TFs). Despite extensive genetic and computational studies, the relationship between DNA sequence and regulatory activity is poorly understood and enhancer de novo design is considered impossible. Here we built a deep learning model, DeepSTARR, to quantitatively predict the activities of thousands of developmental and housekeeping enhancers directly from DNA sequence in Drosophila melanogaster S2 …

Cited by 12 publications (13 citation statements)
References 117 publications
“…Strikingly, we find that flanking nucleotide combinations can drive predictions by a factor of 3 relative to the AP-1’s core binding site with random flanks (Fig. 6b), similar to what was observed previously 45,46. A position-weight-matrix-based approach 47, which considers each position independently, would score many AP-1 binding sites the same, despite their wide spread in functional activity.…”
Section: Results | supporting
confidence: 88%
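
The point about position-weight matrices in the statement above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy matrix built around the core `TGACTCA`, the uniform background, and the helper names `column`, `pwm_score`, and `score_core` do not come from the cited papers or from any motif database. It only shows that a model scoring each core position independently, and nothing outside the core, assigns the same score to sites that differ only in their flanks.

```python
import math

BACKGROUND = 0.25  # assumed uniform background frequency per base


def column(favoured, p=0.85):
    """Build one toy PWM column: probability p for the favoured base,
    the remainder split evenly over the other three (illustrative values)."""
    rest = (1.0 - p) / 3
    return {base: (p if base == favoured else rest) for base in "ACGT"}


# Hypothetical 7-bp matrix centred on an AP-1-like core; not a curated motif.
PWM = [column(base) for base in "TGACTCA"]


def pwm_score(site):
    """Sum of per-position log-odds; each position contributes independently."""
    if len(site) != len(PWM):
        raise ValueError("site length must match the PWM width")
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(PWM, site))


def score_core(sequence, start):
    """Score the PWM-width window beginning at `start`; flanks are ignored."""
    return pwm_score(sequence[start:start + len(PWM)])


# Same core, different 3-bp flanks on each side.
site_a = "AAA" + "TGACTCA" + "CCC"
site_b = "GCG" + "TGACTCA" + "TAT"

same_score = score_core(site_a, 3) == score_core(site_b, 3)
print(same_score)
```

A sequence-to-activity model that reads the flanks as well as the core can, in principle, separate such sites, which is the contrast the citing authors draw.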
“…Fig. S2: Cooperativity (residual fold change; y-axis) plotted as a function of distance (x-axis) between the motifs of the housekeeping TFs Dref (top row), Ohler1 (middle row), and Ohler6 (bottom row) for DeepSTARR 72 (left column) and ExplaiNN (right column). The 5-mer GGGCT is provided as a negative control (blue).…”
Section: Figures | mentioning
confidence: 99%
“…Additional mechanistic insight has been provided by thermodynamic modelling of enhancers 34,35, in vivo imaging of enhancer activity 36, the analysis of genetic variation through eQTL and caQTL analysis 6,37, and high-throughput in vitro binding assays 38,39. Recently, the enhancer biology field embraced the use of convolutional neural networks (CNN) and network-explainability techniques that again provided a significant leap forward in terms of prediction accuracy and syntax formulation 10,40–44. An orthogonal strategy to decode enhancer logic is to engineer synthetic enhancers from scratch.…”
Section: Introduction | mentioning
confidence: 99%
“…This approach has the advantage that the designer knows exactly which features are implanted, so that the minimal requirements for enhancer function can be revealed. Recent work showed the promise of CNN-driven enhancer design by successfully designing yeast promoters 45, and by using a CNN to select high-scoring enhancers for S2 cells, from a large pool of random sequences 42. Here we tackle the next challenge in enhancer design, namely to design enhancers that are cell type specific.…”
Section: Introduction | mentioning
confidence: 99%