2021
DOI: 10.1038/s42256-020-00291-x
Improving representations of genomic sequence motifs in convolutional networks with exponential activations

Abstract: Deep convolutional neural networks (CNNs) trained on regulatory genomic sequences tend to build representations in a distributed manner, making it a challenge to extract learned features that are biologically meaningful, such as sequence motifs. Here we perform a comprehensive analysis on synthetic sequences to investigate the role that CNN activations play in model interpretability. We show that employing an exponential activation in first-layer filters consistently leads to interpretable and robust representations […]
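The abstract's key idea lends itself to a short illustration. Below is a minimal sketch, assuming a standard motif-scanning CNN over one-hot DNA (the MotifCNN name, filter count, filter width, and pooling choices are illustrative assumptions, not the authors' exact architecture): the only change from a conventional design is that the first-layer pre-activations pass through exp instead of ReLU.

```python
import torch
import torch.nn as nn

class MotifCNN(nn.Module):
    """Sketch of a genomics CNN whose first layer uses an exponential
    activation, the modification the paper studies. All architecture
    details here are assumptions for illustration."""

    def __init__(self, num_filters=32, filter_size=19, seq_len=200):
        super().__init__()
        # One-hot DNA input: 4 channels (A, C, G, T).
        self.conv1 = nn.Conv1d(4, num_filters, kernel_size=filter_size,
                               padding="same")
        self.pool = nn.MaxPool1d(seq_len)    # global max pool per filter
        self.fc = nn.Linear(num_filters, 1)  # single binary output

    def forward(self, x):
        # exp flattens weak (background) responses toward zero and
        # amplifies strong matches, encouraging each first-layer filter
        # to represent a whole motif rather than a distributed part.
        z = torch.exp(self.conv1(x))
        z = self.pool(z).squeeze(-1)
        return torch.sigmoid(self.fc(z))

# Toy forward pass on stand-in data (not real genomic sequences):
x = torch.randn(8, 4, 200).softmax(dim=1)  # simplex per position, one-hot-like
print(MotifCNN()(x).shape)  # torch.Size([8, 1])
```

One practical caveat: because exp grows quickly, training can be less stable than with ReLU, so weight initialization and learning-rate choices become more delicate.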

Cited by 63 publications (37 citation statements: 1 supporting, 36 mentioning, 0 contrasting) | References 44 publications
“…Another potential trend is building DNNs using biophysical (Tareen and Kinney, 2019) or physicochemical properties (Yang et al, 2017;Liu et al, 2020), as deep models trained on these features might uncover novel patterns in data and lead to improved understanding of the physicochemical principles of protein-nucleic acid regulatory interactions, as well as aid model interpretability. Other novel approaches include: 1) modifying DNN properties to improve recovery of biologically meaningful motif representations (Koo and Ploenzke, 2021), 2) transformer networks (Devlin et al, 2018) and attention mechanisms (Vaswani et al, 2017), widely used in protein sequence modeling (Jurtz et al, 2017;Rao et al, 2019;Vig et al, 2020;Repecka et al, 2021), 3) graph convolutional neural networks, a class of DNNs that can work directly on graphs and take advantage of their structural information, with the potential to give us great insights if we can reframe genomics problems as graphs (Cranmer et al, 2020;Strokach et al, 2020), and 4) generative modeling (Foster, 2019), which may help exploit current knowledge in designing synthetic sequences with desired properties (Killoran et al, 2017;Wang Y. et al, 2020). With the latter, unsupervised training is used with approaches including: 1) autoencoders, which learn efficient representations of the training data, typically for dimensionality reduction (Way and Greene, 2018) or feature selection (Xie et al, 2017), 2) generative adversarial networks, which learn to generate new data with the same statistics as the training set (Wang Y. et al, 2020;Repecka et al, 2021), and 3) deep belief networks, which learn to probabilistically reconstruct their inputs, acting as feature detectors, and can be further trained with supervision to build efficient classifiers (Bu et al, 2017).…”
Section: Advantages (mentioning)
confidence: 99%
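As one concrete example of the unsupervised approaches enumerated in the statement above, here is a minimal autoencoder sketch for dimensionality reduction of one-hot sequences; all layer sizes and names are illustrative assumptions rather than any cited paper's model.

```python
import torch
import torch.nn as nn

class SeqAutoencoder(nn.Module):
    """Autoencoder over flattened one-hot sequences; the encoder output z
    is the learned low-dimensional representation. Sizes are illustrative."""

    def __init__(self, seq_len=200, alphabet=4, latent_dim=32):
        super().__init__()
        d = seq_len * alphabet
        self.encoder = nn.Sequential(
            nn.Flatten(), nn.Linear(d, 256), nn.ReLU(), nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, d))

    def forward(self, x):                   # x: (N, alphabet, seq_len)
        z = self.encoder(x)                 # compressed embedding
        recon = self.decoder(z).view_as(x)  # reconstruct the input
        return recon, z

x = torch.rand(16, 4, 200)                 # stand-in batch, not real data
recon, z = SeqAutoencoder()(x)
loss = nn.functional.mse_loss(recon, x)    # typical reconstruction objective
```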
“…Inferring promoter motifs from convolutional kernels. We inferred promoter motifs learned by each trained model by examining the 256 kernels in the first convolutional layer, which capture such information [23]. For each kernel x, denoted by Conv1d_x, we generated a feature map F_x of dimension N × 5 × 1000 as the output of processing all N unique one-hot-encoded 1000-bp promoter sequences (P_1, …”
Section: Interpreting the Convolutional Kernels (mentioning)
confidence: 99%
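The statement above maps directly onto a first-layer forward pass. The sketch below assumes a trained torch.nn.Conv1d first layer and (N, 5, 1000) one-hot promoters, matching the stated dimensions; the alignment-and-average step for turning a kernel's strongest activations into a motif matrix follows common practice and is an assumption, not that paper's exact procedure.

```python
import torch

def first_layer_feature_maps(conv1: torch.nn.Conv1d,
                             sequences: torch.Tensor) -> torch.Tensor:
    """sequences: (N, 5, 1000) one-hot promoters (alphabet of 5, e.g. ACGT+N,
    per the stated dimensions). Returns (N, 256, L_out) activations; the
    feature map for kernel x is out[:, x, :]."""
    with torch.no_grad():
        return conv1(sequences)

def kernel_pfm(feature_maps, sequences, x, filter_size, threshold=0.5):
    """Assumed follow-up step (standard practice, cf. ref. 23): align the
    subsequences that activate kernel x above a fraction of its maximum
    response and average their one-hot columns into a position frequency
    matrix. Assumes positive activations so the threshold is meaningful."""
    fmap = feature_maps[:, x, :]                      # (N, L_out)
    hits = (fmap > threshold * fmap.max()).nonzero()  # rows of (seq_idx, pos)
    windows = [sequences[i, :, j:j + filter_size]
               for i, j in hits.tolist()
               if j + filter_size <= sequences.shape[-1]]
    return torch.stack(windows).mean(dim=0)           # (5, filter_size)
```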
“…To gain insights into what DNN-based methods have learned, DLPRB visualizes filter representations while cDeepbind employs in silico mutagenesis. Filter representations are sensitive to network design choices [29,30]; ResidualBind is not designed with the intention of learning interpretable filters. Hence, we opted to employ in silico mutagenesis, which systematically probes the effect size that each possible single nucleotide mutation in a given sequence has on model predictions.…”
Section: Going Beyond In Silico Mutagenesis With GIA (mentioning)
confidence: 99%
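For reference, in silico mutagenesis as described above reduces to an exhaustive substitution loop. This is a generic sketch under assumed shapes (an (A, L) one-hot sequence and a model returning one scalar per sequence), not cDeepbind's or ResidualBind's actual code.

```python
import torch

def in_silico_mutagenesis(model, seq_onehot):
    """seq_onehot: (A, L) one-hot tensor; model maps a (1, A, L) batch to a
    scalar prediction. Returns an (A, L) matrix whose entry [b, p] is the
    change in prediction when position p is substituted with base b."""
    A, L = seq_onehot.shape
    effects = torch.zeros(A, L)
    with torch.no_grad():
        ref = model(seq_onehot.unsqueeze(0)).item()  # wild-type prediction
        for pos in range(L):
            for base in range(A):
                if seq_onehot[base, pos] == 1:       # skip the reference base
                    continue
                mutant = seq_onehot.clone()
                mutant[:, pos] = 0
                mutant[base, pos] = 1                # one substitution at a time
                effects[base, pos] = model(mutant.unsqueeze(0)).item() - ref
    return effects
```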
“…For RBPs, this has been accomplished by visualizing first convolutional layer filters and via attribution methods [13,18,23,24]. First layer filters have been shown to capture motif-like representations, but their efficacy depends highly on choice of model architecture [29], activation function [30], and training procedure [31]. First-order attribution methods, including in silico mutagenesis [13,32] and other gradient-based methods [19, 33–36], are interpretability methods that identify the independent importance of single nucleotide variants in a given sequence toward model predictions, not the effect size of extended patterns such as sequence motifs.…”
Section: Introduction (mentioning)
confidence: 99%
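For contrast with the mutagenesis loop sketched earlier, a first-order gradient-based attribution needs only one backward pass per sequence. The sketch below implements gradient × input, one common member of the family the statement cites; the model interface and shapes are assumptions.

```python
import torch

def gradient_x_input(model, seq_onehot):
    """seq_onehot: (A, L) one-hot tensor; model maps a (1, A, L) batch to a
    scalar. Returns per-position importance scores of length L: the gradient
    at each observed nucleotide, a first-order estimate of its effect size."""
    x = seq_onehot.unsqueeze(0).clone().requires_grad_(True)
    model(x).sum().backward()   # one backward pass scores every position
    return (x.grad * x.detach()).squeeze(0).sum(dim=0)  # (L,)
```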