Explaining black-box models such as deep neural networks is becoming increasingly important as it helps to boost trust and debugging. Popular forms of explanations map the features to a vector indicating their individual importance to a decision on the instance-level. They can then be used to prevent the model from learning the wrong bias in data possibly due to ambiguity. For instance, Ross et al.'s ``right for the right reasons'' propagates user explanations backwards to the network by formulating differentiable constraints based on input gradients.
Unfortunately, input gradients as well as many other widely used explanation methods form an approximation of the decision boundary and assume the underlying model to be fixed. Here, we demonstrate how to make use of influence functions---a well known robust statistic---in the constraints to correct the model’s behaviour more effectively. Our empirical evidence demonstrates that this ``right for better reasons''(RBR) considerably reduces the time to correct the classifier at training time and boosts the quality of explanations at inference time compared to input gradients. Besides, we also showcase the effectiveness of RBR in correcting "Clever Hans"-like behaviour in real, high-dimensional domain.
The goal of combining the robustness of neural networks and the expressivity of symbolic methods has rekindled the interest in Neuro-Symbolic AI. One specifically interesting branch of research is deep probabilistic programming languages (DPPLs) which carry out probabilistic logical programming via the probability estimations of deep neural networks. However, recent SOTA DPPL approaches allow only for limited conditional probabilistic queries and do not offer the power of true joint probability estimation. In our work, we propose an easy integration of tractable probabilistic inference within a DPPL. To this end we introduce SLASH, a novel DPPL that consists of Neural-Probabilistic Predicates (NPPs) and a logical program, united via answer set programming. NPPs are a novel design principle allowing for the unification of all deep model types and combinations thereof to be represented as a single probabilistic predicate. In this context, we introduce a novel +/- notation for answering various types of probabilistic queries by adjusting the atom notations of a predicate. We evaluate SLASH on the benchmark task of MNIST addition as well as novel tasks for DPPLs such as missing data prediction, generative learning and set prediction with state-of-the-art performance, thereby showing the effectiveness and generality of our method.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.