We propose a method for semantic image segmentation, combining a deep neural network and spatial relationships between image regions, encoded in a graph representation of the scene. Our proposal is based on inexact graph matching, formulated as a quadratic assignment problem applied to the output of the neural network. The proposed method is evaluated on a public dataset used for segmentation of images of faces, and compared to the U-Net deep neural network that is widely used for semantic segmentation. Preliminary results show that our approach is promising. In terms of Intersection-over-Union of region bounding boxes, the improvement is of 2.4% in average, compared to U-Net, and up to 24.4% for some regions. Further improvements are observed when reducing the size of the training dataset (up to 8.5% in average).
The paper addresses the fundamental task of semantic image analysis by exploiting structural information (spatial relationships between image regions). We propose to combine a deep neural network (CNN) with graph matching where graphs encode efficiently structural information related to regions segmented by the CNN. Our novel approach solves the quadratic assignment problem (QAP) sequentially for matching graphs. The optimal sequence for graph matching is conveniently defined using reinforcement-learning (RL) based on the region membership probabilities produced by the CNN and their structural relationships. Our RL based strategy for solving QAP sequentially allows us to significantly reduce the combinatorial complexity for graph matching. Preliminary experiments are performed on both a synthetic dataset and a public dataset dedicated to the semantic segmentation of face images. Results show that the proposed RL-based ordering significantly outperforms random ordering, and that our strategy is about 386 times faster than a global QAP-based approach, while preserving similar segmentation accuracy.
Deep learning based pipelines for semantic segmentation often ignore structural information available on annotated images used for training. We propose a novel post-processing module enforcing structural knowledge about the objects of interest to improve segmentation results provided by deep learning. This module corresponds to a "manyto-one-or-none" inexact graph matching approach, and is formulated as a quadratic assignment problem. Using two standard measures for evaluation, we show experimentally that our pipeline for segmentation of 3D MRI data of the brain outperforms the baseline CNN (U-Net) used alone. In addition, our approach is shown to be resilient to small training datasets that often limit the performance of deep learning.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.