Training Interpretable Convolutional Neural Networks by Differentiating Class-Specific Filters

Liang, Haoyu; Ouyang, Zhihao; Zeng, Yuyuan; Su, Hang; He, Zihao; Xia, Shu-Tao; Zhu, Jun; Zhang, Bo

doi:10.1007/978-3-030-58536-5_37

Cited by 35 publications

(19 citation statements)

References 26 publications

“…It is shown that neuron units generally extract features that can be interpreted as various levels of semantic concept, from textures and patterns to objects and scenes. Moreover, to learn interpretable neural networks, one option is to disentangle the representations learned by internal filters, which makes the filters more specialized [45,27]. Inspired by these works, we observe that in deep MDE networks, some hidden units are selective to some ranges of depth.…”

Section: Introduction (mentioning)

confidence: 80%

“…Moreover, other methods that share a similar concept to our method are to learn more specialized filters. In interpretable CNNs from [45], each filter represents a specific object part, while a more recent study [27] trains interpretable CNNs by alleviating filter-class entanglement, i.e. each filter only responds to one or few classes.…”

Section: Interpretable Deep Network For Vision (mentioning)

confidence: 99%

Towards Interpretable Deep Networks for Monocular Depth Estimation

You, Tsai, Chiu et al. 2021. Preprint.

Deep networks for Monocular Depth Estimation (MDE) have achieved promising performance recently and it is of great importance to further understand the interpretability of these networks. Existing methods attempt to provide posthoc explanations by investigating visual cues, which may not explore the internal representations learned by deep networks. In this paper, we find that some hidden units of the network are selective to certain ranges of depth, and thus such behavior can serve as a way to interpret the internal representations. Based on our observations, we quantify the interpretability of a deep MDE network by the depth selectivity of its hidden units. Moreover, we then propose a method to train interpretable MDE deep networks without changing their original architectures, by assigning a depth range for each unit to select. Experimental results demonstrate that our method is able to enhance the interpretability of deep MDE networks by largely improving the depth selectivity of their units, while not harming or even improving the depth estimation accuracy. We further provide comprehensive analysis to show the reliability of selective units, the applicability of our method on different layers, models, and datasets, and a demonstration on analysis of model error. Source code and models are available at https://github.com/youzunzhi/InterpretableMDE.

“…Zhang et al [8] designed interpretable CNNs by making each filter represent a specific object part. Liang et al [23] trained interpretable CNNs by learning class-specific deep filters, namely, encouraging each filter only to account for few classes. Similarly, You et al [13] proposed to improve the depth selectivity by designing specific loss functions for MDE models.…”

Section: Interpretable and Explainable Deep Neural Network (mentioning)

confidence: 99%

Disentangled Latent Transformer for Interpretable Monocular Height Estimation

Xiong, Chen, Shi et al. 2022. Preprint.

Monocular height estimation (MHE) from remote sensing imagery has high potential in generating 3D city models efficiently for a quick response to natural disasters. Most existing works pursue higher performance. However, there is little research exploring the interpretability of MHE networks. In this paper, we aim to explore how deep neural networks predict height from a single monocular image. Towards a comprehensive understanding of MHE networks, we propose to interpret them from multiple levels: 1) Neurons: unit-level dissection. Exploring the semantic and height selectivity of the learned internal deep representations; 2) Instances: object-level interpretation. Studying the effects of different semantic classes, scales and spatial contexts on height estimation; 3) Attribution: pixel-level analysis. Understanding which input pixels are important for the height estimation. Based on the multi-level interpretation, a disentangled latent Transformer network is proposed towards a more compact, reliable and explainable deep model for monocular height estimation. Furthermore, a novel unsupervised semantic segmentation task based on height estimation is first introduced in this work. Additionally, we also construct a new dataset for joint semantic segmentation and height estimation. Our work provides novel insights for both understanding and designing MHE models. The dataset and code are publicly available at https://github.com/ShadowXZT/DLT-Height-Estimation.pytorch.

“…Providing human-intelligible explanations for the decisions of DCNNs is the ultimate goal of Explainable Artificial Intelligence (XAI) [9,10]. Existing work on interpretability mainly involves training interpretable machine learning [11,12] and explaining blackbox models [13][14][15]. Here, we mainly focus on the methods of explaining the black box model.…”

Section: Introduction (mentioning)

confidence: 99%

“…Figure 11. For different interpretation algorithms under top-3% occlusion, the statistical results of object names that users can identify.…”

mentioning

confidence: 99%

Towards a Reliable Evaluation of Local Interpretation Methods

Lin, Wang et al. 2021. Applied Sciences.

The growing use of deep neural networks in critical applications makes interpretability an urgent problem. Local interpretation methods are the most prevalent and accepted approach for understanding and interpreting deep neural networks. How to effectively evaluate the local interpretation methods is challenging. To address this question, a unified evaluation framework is proposed, which assesses local interpretation methods from three dimensions: accuracy, persuasibility and class discriminativeness. Specifically, in order to assess correctness, we designed an interactive user feature annotation tool to provide ground truth for local interpretation methods. To verify the usefulness of the interpretation method, we iteratively display part of the interpretation results, and then ask users whether they agree with the category information. At the same time, we designed and built a set of evaluation data sets with a rich hierarchical structure. Surprisingly, one finding is that the existing visual interpretation methods cannot satisfy all evaluation dimensions at the same time, and each has its own shortcomings.
