2019
DOI: 10.1101/518191
Preprint

Deep neural networks for interpreting RNA binding protein target preferences

Abstract: Deep learning has become a powerful paradigm to analyze the binding sites of regulatory factors including RNA-binding proteins (RBPs), owing to its strength to learn complex features from possibly multiple sources of raw data. However, the interpretability of these models, which is crucial to improve our understanding of RBP binding preferences and functions, has not yet been investigated in significant detail. We have designed a multitask and multimodal deep neural network for characterizing in vivo RBP bindi…

Cited by 10 publications (22 citation statements)
References 37 publications
“…Our work also demonstrates that the removal of border artefacts is crucial for an end-to-end learning system to learn non-trivial protein binding hypothesis. An interesting alternative to our de-biasing technique is the one introduced by Ghanbari and Ohler (2019), who formulate the RBP binding prediction problem as a multi-class classification, aiming to simultaneously predict the binding of all RBPs in a collection of CLIP-seq data sets, which combines multiple biased RBP dataset into one. If all datasets are equally affected by sequence biases introduced by the experimental protocol, then this bias is uninformative for the prediction task and should not significantly affect the training.…”
Section: Discussion (mentioning)
confidence: 99%
“…Integrated gradients (Sundararajan et al, 2017) are an effective approach to assign an "attribution score" Attr(i) to each position i of a given input sequence s, measuring the extent to which the nucleotide at that position contributes to the entire sequence's prediction score. Figure 2: Positive examples from the PAR-CLIP dataset can be biased at the beginning and at the end of the viewpoint regions, emitting an unusually high frequency of Guanine (Ghanbari and Ohler, 2019) and some correlated residuals, which can be revealed by aligning the viewpoint borders of all positive examples. Such pattern, however, does not seem to exist in the negative examples, e.g.…”
Section: Sequence and Secondary Structure Motif Extraction (mentioning)
confidence: 99%
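
The attribution score Attr(i) described in the statement above can be illustrated with a minimal sketch of integrated gradients. Everything here is an assumption made for illustration: `toy_score` is a hypothetical stand-in (a sigmoid over a position-wise linear score), not the network from the cited preprint, the weights are random, and the all-zero baseline is one common choice rather than the one the authors necessarily used.

```python
import numpy as np

ALPHABET = "ACGU"

def one_hot(seq):
    """Encode an RNA sequence as a (length, 4) one-hot matrix."""
    x = np.zeros((len(seq), len(ALPHABET)))
    for i, nt in enumerate(seq):
        x[i, ALPHABET.index(nt)] = 1.0
    return x

def toy_score(x, w, b):
    """Hypothetical stand-in model: sigmoid of a position-wise linear score."""
    return 1.0 / (1.0 + np.exp(-(np.sum(w * x) + b)))

def toy_grad(x, w, b):
    """Analytical gradient of toy_score with respect to the input x."""
    p = toy_score(x, w, b)
    return p * (1.0 - p) * w

def integrated_gradients(x, baseline, w, b, steps=200):
    """Approximate integrated gradients with a midpoint Riemann sum."""
    alphas = (np.arange(steps) + 0.5) / steps
    avg_grad = np.zeros_like(x)
    for a in alphas:
        # Gradient at a point on the straight path from baseline to x.
        avg_grad += toy_grad(baseline + a * (x - baseline), w, b)
    avg_grad /= steps
    return (x - baseline) * avg_grad

rng = np.random.default_rng(0)
seq = "GGACUUGG"                      # illustrative sequence, not real data
x = one_hot(seq)
baseline = np.zeros_like(x)           # assumed all-zero baseline
w = rng.normal(size=x.shape)          # random weights, illustration only
b = 0.0

attr_matrix = integrated_gradients(x, baseline, w, b)
attr = attr_matrix.sum(axis=1)        # Attr(i): one score per position i

# Completeness check: attributions should sum to F(x) - F(baseline).
completeness_gap = abs(
    attr.sum() - (toy_score(x, w, b) - toy_score(baseline, w, b))
)
```

Summing over the four nucleotide channels gives one score per position, and `completeness_gap` should be close to zero because integrated gradients attributions add up to the difference between the prediction for the input and the prediction for the baseline. With a trained network, the analytical `toy_grad` would be replaced by gradients from an automatic differentiation framework.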
“…Models that exploit the latter may not necessarily generalize well. Currently, the main approach to interpret a convolutional neural network (CNN) is to visualize learned representations in the input space. In genomics, such methods include visualizing the convolutional filters (Alipanahi et al, 2015; Kelley et al, 2016; Quang & Xie, 2016; Angermueller et al, 2016; Cuperus et al, 2017; Chen et al, 2018; Ben-Bassat et al, 2018; Wang et al, 2018), attribution methods (Alipanahi et al, 2015; Zhou & Troyanskaya, 2015; Kelley et al, 2016; Shrikumar et al, 2017; Ghanbari & Ohler, 2019), and more recently in silico experiments (Koo et al, 2018; Avsec et al, 2019). These approaches can be grouped into local and global interpretability methods.…”
Section: Overview (mentioning)
confidence: 99%
“…For example, gradients (from predictions to the inputs) have been employed to reveal known transcription factor (TF) binding sites when trained to predict read profiles from high-throughput sequencing datasets (Kelley et al, 2018). Integrated gradients were used to uncover motifs for RNA-protein interactions (Ghanbari & Ohler, 2019). Recently, DeepLift was used to uncover known and novel TF binding sites, including their syntax with respect to other binding sites.…”
Section: Local Interpretability (mentioning)
confidence: 99%