2021
DOI: 10.1093/nar/gkab765

Deciphering enhancer sequence using thermodynamics-based models and convolutional neural networks

Abstract: Deciphering the sequence-function relationship encoded in enhancers holds the key to interpreting non-coding variants and understanding mechanisms of transcriptomic variation. Several quantitative models exist for predicting enhancer function and underlying mechanisms; however, there has been no systematic comparison of these models characterizing their relative strengths and shortcomings. Here, we interrogated a rich data set of neuroectodermal enhancers in Drosophila, representing cis- and trans- sources of …

Cited by 7 publications (5 citation statements)
References 65 publications
“…Many computational approaches have sought to predict enhancer activities from DNA sequences using local DNA features, e.g. motif dictionaries or de novo k-mers, and selected syntax rules in various thermodynamic or machine-learning frameworks [16,17,26,28–39]. Despite remarkable success, these approaches did not reveal how the elements of motif syntax collaborate to determine enhancer activity.…”
mentioning
confidence: 99%
“…The motif aggregator module deals with the effect of a combination of TF binding sites (of varying strengths) on gene regulation, using a simple weighted sum to capture this combined effect. A TF's regulatory influence depends on two factors (in addition to binding site strengths): its concentration, which affects its occupancy, and regulatory "potency", i.e., activation or repression strength of a bound TF (see Dibaeinia and Sinha, 2021). The weight learnt for each TF motif by the aggregator module combines these two factors into a single weight.…”
Section: Discussion
mentioning
confidence: 99%
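
The weighted-sum idea in the statement above can be illustrated with a minimal sketch. This is not the citing paper's implementation: the function name `aggregate_motif_scores`, the TF names, and all numbers are hypothetical, and summing site strengths per TF before weighting is an assumption. It only shows one learned scalar per TF motif scaling that TF's binding-site strengths, with the scalar standing in for concentration and potency together.

```python
from typing import Dict, List


def aggregate_motif_scores(
    site_strengths: Dict[str, List[float]],
    tf_weights: Dict[str, float],
) -> float:
    """Combine binding-site strengths of several TFs with a weighted sum.

    site_strengths: hypothetical per-TF list of binding-site strengths.
    tf_weights: hypothetical one-scalar-per-TF weight; in this sketch the
        scalar is assumed to absorb both concentration and regulatory
        potency (positive for activators, negative for repressors).
    """
    total = 0.0
    for tf, strengths in site_strengths.items():
        # Assumption: a TF's sites contribute additively before weighting.
        total += tf_weights[tf] * sum(strengths)
    return total


# Illustrative values only; they are not taken from any data set.
example_sites = {"activator_A": [0.9, 0.4], "repressor_R": [0.7]}
example_weights = {"activator_A": 1.5, "repressor_R": -2.0}
example_score = aggregate_motif_scores(example_sites, example_weights)
```

Because concentration and potency enter only through their combined weight in this sketch, the two factors cannot be separated from the fitted value alone, which matches the statement that the aggregator merges them into a single weight.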
“…Moreover, the internal parameters of the model, such as trainable pooling or interaction attention, are also directly interpretable and can reveal additional information about mechanisms by which highly parameterized models produce high accuracy predictions. Our approach is inspired by related work that encodes well characterized biochemical systems as neural network functions and optimizes their biochemically interpretable parameters with respect to observed data (Dibaeinia and Sinha, 2021; Liu et al, 2020; Tareen and Kinney, 2019). tiSFM has several distinct contributions: (1): tiSFM is more performant than current state-of-the-art models in the field (Maslova et al, 2020) (2): tiSFM separates itself from other works by building a "totally" interpretable architecture that is amendable to transparent analysis at each layer, distinct from previous works (Quang and Xie, 2016; Banovich et al, 2017; Liu et al, 2020) that only rely on initial convolution from PWMs, (3): In our framework, the programmed interpretability in tiSFM allows us to offer more extensive homotypic and heterotypic TF-TF interaction insights that are not offered or possible in previous works (Maslova et al, 2020) (Fig.…”
Section: Introduction
mentioning
confidence: 99%
“…While such genomic readouts are believed to be largely determined by DNA sequence, the precise sequence-to-function (S2F) relation is complex and remains poorly understood. Nevertheless, recent developments have shown that by using deep learning models, with millions of parameters, it is indeed possible to learn S2F mappings, predict epigenetic readouts, or even characterize gene expression (Maslova et al., 2020; Avsec et al., 2021b; Kelley et al., 2016; Zhou and Troyanskaya, 2015; Quang and Xie, 2016; Dibaeinia and Sinha, 2021; Avsec et al., 2021a). Given these successes, it is worthwhile to ask what scientific insights that go beyond advancing our understanding of applying machine learning engineering principles to genomics data such models can provide.…”
Section: Introduction
mentioning
confidence: 99%