Purpose
As concept-based reasoning for improving model interpretability becomes promising, the question of how to define good concepts becomes more pertinent. In domains like medical, it is not always feasible to access instances clearly representing good concepts. In this work, we propose an approach to use organically mined concepts from unlabeled data to explain classifier predictions.
Methods
A Concept Mapping Module (CMM) is central to this approach. Given a capsule endoscopy image predicted as abnormal, the CMM’s main task is to identify which concept explains the abnormality. It consists of two parts, namely a convolutional encoder and a similarity block. The encoder maps the incoming image into the latent vector, while the similarity block retrieves the closest aligning concept as explanation.
Results
Abnormal images can be explained in terms of five pathology-related concepts retrieved from the latent space given by inflammation (mild and severe), vascularity, ulcer and polyp. Other non-pathological concepts found include anatomy, debris, intestinal fluid and capsule modality.
Conclusions
This method outlines an approach through which concept-based explanations can be generated. Exploiting the latent space of styleGAN to look for variations and using task-relevant variations for defining concepts is a powerful way through which an initial concept dictionary can be created which can subsequently be iteratively refined with much less time and resource.