Semantic object segmentation via detection in weakly labeled video

Chen, Xiaowu; Li, Jia; Wang, Chen; Xia, Changqun

doi:10.1109/cvpr.2015.7298987

Cited by 67 publications

(62 citation statements)

References 28 publications

(68 reference statements)

“…In the same setting of multiple foreground vs. single background, several methods have proposed to rely on additional supervision. For instance, [63] relied on the CPMC [8] region detector, which has been trained from pixel-level annotations, to segment foreground from background. In [58] and [15], object proposal methods trained from pixel-level and bounding box annotations, respectively, were employed.…”

Section: Road (mentioning)

confidence: 99%

Bringing Background into the Foreground: Making All Classes Equal in Weakly-Supervised Video Semantic Segmentation

Saleh, Aliakbarian, Salzmann et al. 2017

2017 IEEE International Conference on Computer Vision (ICCV)

Pixel-level annotations are expensive and time-consuming to obtain. Hence, weak supervision using only image tags could have a significant impact in semantic segmentation. Recent years have seen great progress in weakly-supervised semantic segmentation, whether from a single image or from videos. However, most existing methods are designed to handle a single background class. In practical applications, such as autonomous navigation, it is often crucial to reason about multiple background classes. In this paper, we introduce an approach to doing so by making use of classifier heatmaps. We then develop a two-stream deep architecture that jointly leverages appearance and motion, and design a loss based on our heatmaps to train it. Our experiments demonstrate the benefits of our classifier heatmaps and of our two-stream architecture on challenging urban scene datasets and on the YouTube-Objects benchmark, where we obtain state-of-the-art results.

“…We validate our proposed method on the Youtube-Object-Dataset (Prest et al, 2012; Jain and Grauman, 2014; Zhang et al, 2015). In experiments our model greatly improved the pre-trained model by mitigating its drawback even when we do not use the weak labels.…”

Section: Introduction (mentioning)

confidence: 92%

Video semantic object segmentation by self-adaptation of DCNN

Park, Hong 2018

Pattern Recognition Letters

This paper proposes a new framework for semantic segmentation of objects in videos. We address the label inconsistency problem of deep convolutional neural networks (DCNNs) by exploiting the fact that videos have multiple frames; in a few frames the object is confidently-estimated (CE) and we use the information in them to improve labels of the other frames. Given the semantic segmentation results of each frame obtained from DCNN, we sample several CE frames to adapt the DCNN model to the input video by focusing on specific instances in the video rather than general objects in various circumstances. We propose offline and online approaches under different supervision levels. In experiments our method achieved great improvement over the original model and previous state-of-the-art methods.

“…However, these methods rely on the quality of generated segment proposals and may produce inaccurate results when taking low-quality segments as the input. Zhang et al [44] propose to utilize object detectors integrated with object proposals to refine segmentations in videos. Furthermore, Tsai et al [40] develop a co-segmentation framework by linking object tracklets from all the videos and improve the result.…”

Section: Related Work (mentioning)

confidence: 99%

“…Compared to the baseline FCN model [20] used in our algorithm, there is a performance gain of 9%. In addition, while existing methods rely on training the segment classifier [34], integrating object proposals with detectors [44], co-segmentation via modeling relationships between videos [40], or self-paced fine-tuning [42], the proposed method utilizes a self-learning scheme to achieve better segmentation results. With the ResNet-101 architecture, we compare our method with DeepLab [2] and FSEG [12].…”

Section: YouTube-Objects Dataset (mentioning)

confidence: 99%

Unseen Object Segmentation in Videos via Transferable Representations

Chen, Tsai, Yang et al. 2019

Computer Vision – ACCV 2018

In order to learn object segmentation models in videos, conventional methods require a large amount of pixel-wise ground truth annotations. However, collecting such supervised data is time-consuming and labor-intensive. In this paper, we exploit existing annotations in source images and transfer such visual information to segment videos with unseen object categories. Without using any annotations in the target video, we propose a method to jointly mine useful segments and learn feature representations that better adapt to the target frames. The entire process is decomposed into two tasks: 1) solving a submodular function for selecting object-like segments, and 2) learning a CNN model with a transferable module for adapting seen categories in the source domain to the unseen target video. We present an iterative update scheme between two tasks to self-learn the final solution for object segmentation. Experimental results on numerous benchmark datasets show that the proposed method performs favorably against the state-of-the-art algorithms.
