The objective in image co-segmentation is to jointly
segment unknown common objects from a given
set of images. In this paper, we propose a novel
deep convolution neural network based end-to-end
co-segmentation model. It is composed of a metric
learning and decision network leading to a novel
conditional siamese encoder-decoder network for
estimating a co-segmentation mask. The role of the
metric learning network is to find an optimum latent
feature space where objects of the same class
are closer and that of different classes are separated
by a certain margin. Depending on the
extracted features, the decision network decides
whether input images have common objects or not
and the encoder-decoder network produces a cosegmentation
mask accordingly. Key aspects of the
architecture are as follows. First, it is completely
class agnostic and does not require any semantic
information. Second, in addition to producing
masks, the decoder network also learns similarity
across image pairs that improves co-segmentation
significantly. Experimental results reflect an excellent
performance of our method compared to state of-the-art
methods on challenging co-segmentation
datasets.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.