A rich line of research attempts to make deep neural networks more transparent by generating human-interpretable 'explanations' of their decision process, especially for interactive tasks like Visual Question Answering (VQA). In this work, we analyze whether existing explanations indeed make a VQA model (its responses as well as its failures) more predictable to a human. Surprisingly, we find that they do not. On the other hand, we find that human-in-the-loop approaches that treat the model as a black box do.
In this paper, we make a simple observation: questions about images often contain premises (objects and relationships implied by the question), and reasoning about premises can help Visual Question Answering (VQA) models respond more intelligently to irrelevant or previously unseen questions. When presented with a question that is irrelevant to an image, state-of-the-art VQA models will still answer purely based on learned language biases, resulting in nonsensical or even misleading answers. We note that a visual question is irrelevant to an image if at least one of its premises is false (i.e., not depicted in the image). We leverage this observation to construct a dataset for Question Relevance Prediction and Explanation (QRPE) by searching for false premises. We train novel question relevance detection models and show that models that reason about premises consistently outperform models that do not. We also find that forcing standard VQA models to reason about premises during training can lead to improvements on tasks requiring compositional reasoning.
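The relevance rule stated above (a question is irrelevant if at least one of its premises is false) can be illustrated with a minimal sketch. This is not the paper's method: representing premises as plain strings, the set of depicted concepts, and the function names `false_premises` and `is_relevant` are all assumptions made for illustration, and the premise extraction and learned models the abstract describes are not shown.

```python
from typing import Iterable, List, Set


def false_premises(premises: Iterable[str], depicted: Set[str]) -> List[str]:
    """Return the premises that are not depicted in the image.

    Assumption: premises and image contents are simplified to plain strings.
    """
    return [p for p in premises if p not in depicted]


def is_relevant(premises: Iterable[str], depicted: Set[str]) -> bool:
    """A question is relevant only if none of its premises is false."""
    return len(false_premises(premises, depicted)) == 0


# Hypothetical image contents and question premises, for illustration only.
depicted = {"dog", "grass", "frisbee"}
question_premises = ["dog", "sofa"]  # e.g. "What color is the sofa the dog is on?"

print(is_relevant(question_premises, depicted))     # False
print(false_premises(question_premises, depicted))  # ['sofa']
```

Returning the list of false premises, rather than only a yes/no flag, mirrors the "explanation" half of the task: the false premise itself indicates why the question does not apply to the image.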