2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021
DOI: 10.1109/cvpr46437.2021.01128

Black-box Explanation of Object Detectors via Saliency Maps

Abstract: We propose D-RISE, a method for generating visual explanations for the predictions of object detectors. D-RISE can be considered "black-box" in the software testing sense: it only needs access to the inputs and outputs of an object detector. Compared to gradient-based methods, D-RISE is more general and agnostic to the particular type of object detector being tested, as it does not need to know about the inner workings of the model. We show that D-RISE can be easily applied to different object detectors includi…

Cited by 147 publications (208 citation statements)
References: 37 publications
“…The size-constraint approach in (Kervadec et al, 2019b) achieved similar low performance because the use of the presence and non-presence constraints does not impose any upper bound on the size of the regions of interest, which, typically, results in the activation of large regions. (Wei et al, 2017) 7.50 65.60 25.01 CAM-Max (Oquab et al, 2015) 1.25 66.00 26.32 CAM-LSE (Pinheiro and Collobert, 2015;Sun et al, 2016) 1.25 66.05 27.93 Grad-CAM (Selvaraju et al, 2017) 0.00 66.30 21.30 CAM-Avg (Zhou et al, 2016) 0.00 66.90 17.88 Wildcat (Durand et al, 2017) 1 In terms of image classification performance (Table 1, first column), the proposed method obtains the lowest classification error, similarly to other methods such as CAM-avg (Zhou et al, 2016) and Grad-CAM (Selvaraju et al, 2017). It is noteworthy to mention that, despite providing the best segmentation results, U-Net cannot simultaneously provide image classification predictions.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…Pinpointing image sub-regions that were used by the model to make its global image-class prediction not only provides weakly supervised segmentation, but also enables interpretable deep-network classifiers. It is worth noting that such interpretability aspects are also attracting wide interest in computer vision (Bach et al, 2015;Bau et al, 2017;Bhatt et al, 2020;Dabkowski and Gal, 2017;Escalante et al, 2018;Fong et al, 2019;Fong and Vedaldi, 2017;Goh et al, 2020;Osman et al, 2020;Murdoch et al, 2019;Petsiuk et al, 2020;2018;Ribeiro et al, 2016;Samek et al, 2020;Zhang et al, 2020;Belharbi et al, 2021) and medical imaging (de La Torre et al, 2020;Gondal et al, 2017;González-Gonzalo et al, 2020;Taly et al, 2019;Quellec et al, 2017;Keel et al, 2019;Wang et al, 2017). Deep learning classifiers are often considered as "black boxes" due to the lack of explanatory factors in their decisions.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…RISE [24] averages random binary masks according to the model's output class probability for the masked inputs. This is extended in D-RISE [25] by a similarity metric allowing its application to detection models as well.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
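
The mask-averaging idea described in the statement above can be sketched roughly as follows. This is a minimal NumPy illustration, not the implementation from either cited paper: the function names (`random_masks`, `saliency`, `detection_score`), the nearest-neighbour mask upsampling, the normalisation by the number of masks, and the specific similarity (box IoU multiplied by cosine similarity of class probabilities) are all assumptions chosen for illustration.

```python
import numpy as np

def random_masks(n_masks, grid, image_hw, p_keep, rng):
    """Sample coarse binary grids and upsample them (nearest neighbour) to image size."""
    h, w = image_hw
    cells = (rng.random((n_masks, grid, grid)) < p_keep).astype(np.float64)
    rows = np.arange(h) * grid // h  # coarse row index for every pixel row
    cols = np.arange(w) * grid // w  # coarse column index for every pixel column
    return cells[:, rows[:, None], cols[None, :]]  # shape (n_masks, h, w)

def saliency(image, score_fn, masks):
    """Average the masks, each weighted by the score of the masked image."""
    weights = np.array([score_fn(image * m[..., None]) for m in masks])
    # sum_i w_i * M_i, divided by the number of masks (an assumed normalisation)
    return np.tensordot(weights, masks, axes=1) / len(masks)

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def detection_score(target_box, target_probs, detections):
    """Hypothetical detection similarity: best IoU * class-probability cosine similarity."""
    best = 0.0
    for box, probs in detections:
        denom = np.linalg.norm(target_probs) * np.linalg.norm(probs) + 1e-12
        cos = float(np.dot(target_probs, probs) / denom)
        best = max(best, iou(target_box, box) * cos)
    return best
```

For a classifier, `score_fn` would return the class probability for the masked image. For a detector, it would run the detector on the masked image and pass its outputs to something like `detection_score`, so the model is only ever queried through its inputs and outputs.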
“…Inspired by perturbation approaches to generate saliency maps for image-based black-box models [24,25,56], we leverage the principle of analysis by occlusion. We propose OccAM: Occlusion-based Attribution Maps for 3D object detectors on LiDAR data.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Thirdly, the bias present in pre-trained models may propagate into the target task leading to an inadvertently biased target model. The deep networks exhibit different types of biases due to factors such as background, color, racial (Gwilliam et al (2021)), gender (Tang et al (2021); Zhao et al (2017)), contextual (Singh et al (2020)), co-occurrence (Petsiuk et al (2021)), spatial noise, dataset (Tommasi et al (2017)) and object-size (Nguyen et al (2020)). For instance, Petsiuk et al (2021) show that the object detectors are vulnerable to learning the co-occurrence of an unrelated adversarial marker.…”
Section: Introduction
Citation type: mentioning
confidence: 99%