Evaluating the Visualization of What a Deep Neural Network Has Learned

Samek, Wojciech; Binder, Alexander; Montavon, Grégoire; Lapuschkin, Sebastian; Müller, Klaus‐Robert

doi:10.1109/tnnls.2016.2599820

Cited by 942 publications

(751 citation statements)

References 27 publications

Supporting

Mentioning

745

Contrasting

“…This makes the method particularly well suited to analyzing image classifiers, though the method has also been adapted for text and electroencephalogram signal classification [31]. Samek et al [32] have also developed an objective metric for comparing the output of LRP with similar heatmapping algorithms. Kumar et al [33] present an alternative heat-mapping method that can show the image regions that the model was most attentive to, but also allows for multiple classes to be associated with these regions of attention, whereas LRP assumes all features make either a zero or positive contribution to the single predicted class.…”

Section: B Model Functionality (mentioning)

confidence: 99%

Interpretability of deep learning models: A survey of results

Chakraborty, Tomsett, Raghavendra, et al. 2017

2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Commun

Abstract: Deep neural networks have achieved near-human accuracy levels in various types of classification and prediction tasks including images, text, speech, and video data. However, the networks continue to be treated mostly as black-box function approximators, mapping a given input to a classification output. The next step in this human-machine evolutionary process, incorporating these networks into mission-critical processes such as medical diagnosis, planning and control, requires a level of trust association with the machine output.

Typically, statistical metrics are used to quantify the uncertainty of an output. However, the notion of trust also depends on the visibility that a human has into the working of the machine. In other words, the neural network should provide human-understandable justifications for its output leading to insights about the inner workings. We call such models interpretable deep networks.

Interpretability is not a monolithic notion. In fact, the subjectivity of an interpretation, due to different levels of human understanding, implies that there must be a multitude of dimensions that together constitute interpretability. In addition, the interpretation itself can be provided either in terms of the low-level network parameters, or in terms of input features used by the model. In this paper, we outline some of the dimensions that are useful for model interpretability, and categorize prior work along those dimensions. In the process, we perform a gap analysis of what needs to be done to improve model interpretability.

“…This means that explanations of the same type can be compared using a metric without need for any further context [32]. However, explanations of different types (saliency map images [13] and text captions for example [22]) can't be compared using a metric.…”

Section: B Interpretability Versus Explainability (mentioning)

confidence: 99%

Interpretability of deep learning models: A survey of results

Chakraborty, Tomsett, Raghavendra, et al. 2017

2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Commun

“…Simonyan et al [11] compare this method with a form of activation maximization. In [12], the authors show sensitivity maps with evidence both for and against a particular class, while [13] develops heatmaps showing relevance or importance of image regions.…”

Section: Related Work (mentioning)

confidence: 99%

Visualization of feature evolution during convolutional neural network training

Punjabi, Katsaggelos 2017

2017 25th European Signal Processing Conference (EUSIPCO)

Abstract: Convolutional neural networks (CNNs) are a staple in the fields of computer vision and image processing. These networks perform visual tasks with state-of-the-art accuracy; yet, the understanding behind the success of these algorithms is still lacking. In particular, the process by which CNNs learn effective task-specific features is still unclear. This work elucidates such phenomena by applying recent deep visualization techniques during different stages of the training process. Additionally, this investigation provides visual justification for the benefits of transfer learning. The results are in line with previously discussed notions of feature specificity, and show a new facet of a particularly vexing machine learning pitfall: overfitting.

“…It is a principled method which has close relation to Taylor decomposition [11] and is applicable to arbitrary DNN architectures. From a practitioners perspective LRP adds a new dimension to the application of DNNs (e.g., in computer vision [12], [13]) by making the prediction transparent. Within the scope of cognitive neuroscience this means that DNN with LRP, may provide not only a highly effective (non-linear) classification technique that is suitable for complex high-dimensional data, but also yield detailed single-trial accounts of the distribution of decision-relevant information, a feature that is lacking in commonly applied DNN techniques and also in other state-of-the art methods (such as those discussed below).…”

Section: Introduction (mentioning)

confidence: 99%

Interpretable deep neural networks for single-trial EEG classification

Sturm, Lapuschkin, Samek, et al. 2016

Journal of Neuroscience Methods

Self Cite

Abstract:

Background: In cognitive neuroscience the potential of Deep Neural Networks (DNNs) for solving complex classification tasks is yet to be fully exploited. The most limiting factor is that DNNs as notorious 'black boxes' do not provide insight into neurophysiological phenomena underlying a decision. Layerwise Relevance Propagation (LRP) has been introduced as a novel method to explain individual network decisions.

New Method: We propose the application of DNNs with LRP for the first time for EEG data analysis. Through LRP the single-trial DNN decisions are transformed into heatmaps indicating each data point's relevance for the outcome of the decision.

Results: DNN achieves classification accuracies comparable to those of CSP-LDA. In subjects with low performance, subject-to-subject transfer of trained DNNs can improve the results. The single-trial LRP heatmaps reveal neurophysiologically plausible patterns, resembling CSP-derived scalp maps. Critically, while CSP patterns represent class-wise aggregated information, LRP heatmaps pinpoint neural patterns to single time points in single trials.

Comparison with Existing Method(s): We compare the classification performance of DNNs to that of linear CSP-LDA on two data sets related to motor-imagery BCI.

Conclusion: We have demonstrated that DNN is a powerful non-linear tool for EEG analysis. With LRP a new quality of high-resolution assessment of neural activity can be reached. LRP is a potential remedy for the lack of interpretability of DNNs that has limited their utility in neuroscientific applications. The extreme specificity of the LRP-derived heatmaps opens up new avenues for investigating neural activity underlying complex perception or decision-related processes.
