2017 IEEE International Conference on Computer Vision (ICCV) 2017
DOI: 10.1109/iccv.2017.232

Bringing Background into the Foreground: Making All Classes Equal in Weakly-Supervised Video Semantic Segmentation

Abstract: Pixel-level annotations are expensive and time-consuming to obtain. Hence, weak supervision using only image tags could have a significant impact in semantic segmentation. Recent years have seen great progress in weakly-supervised semantic segmentation, whether from a single image or from videos. However, most existing methods are designed to handle a single background class. In practical applications, such as autonomous navigation, it is often crucial to reason about multiple background classes. In this paper,…
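To make the weak-supervision setting concrete: methods of this kind train a segmentation network from image-level tags only, typically by pooling per-pixel class scores into image-level predictions and supervising those with a multi-label classification loss. The sketch below illustrates that generic idea only; it is not the paper's two-stream architecture or heat-map loss, and all names (TinySegNet, NUM_CLASSES, image_tag_loss) are illustrative assumptions.

```python
# Minimal sketch of image-tag weak supervision for semantic segmentation.
# NOT the paper's method; a generic pooling-based baseline for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 11  # assumed CamVid-style label set, for illustration only


class TinySegNet(nn.Module):
    """Toy fully-convolutional network producing per-class score maps."""

    def __init__(self, num_classes: int = NUM_CLASSES):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.classifier = nn.Conv2d(64, num_classes, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))  # (B, C, H, W) score maps


def image_tag_loss(score_maps: torch.Tensor, tags: torch.Tensor) -> torch.Tensor:
    """Pool pixel scores into image-level class scores and supervise them
    with the image tags (multi-label binary targets)."""
    # Global max pooling: a class counts as present if it fires strongly anywhere.
    image_scores = score_maps.flatten(2).max(dim=2).values  # (B, C)
    return F.binary_cross_entropy_with_logits(image_scores, tags)


if __name__ == "__main__":
    model = TinySegNet()
    images = torch.randn(2, 3, 64, 64)      # dummy image batch
    tags = torch.zeros(2, NUM_CLASSES)
    tags[0, [0, 3]] = 1.0                    # image 0 is tagged with classes 0 and 3
    tags[1, [5]] = 1.0                       # image 1 is tagged with class 5
    loss = image_tag_loss(model(images), tags)
    loss.backward()
    print(f"weak-supervision loss: {loss.item():.4f}")
```

At test time, the per-pixel score maps themselves serve as the (weakly trained) segmentation output; the paper's contribution is to make this kind of training treat multiple background classes on an equal footing with the foreground.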

Cited by 35 publications (23 citation statements)
References 66 publications (154 reference statements)
“…The results on CamVid in Table 4, where we compare our method to fully-supervised techniques that make use of CamVid images and annotations to train a model, GTA5-based baselines, and the state-of-the-art weakly-supervised method, show a similar trend. Our approach clearly outperforms the weakly-supervised method of [39] and a DeepLab semantic segmentation network trained on synthetic data. In fact, on this dataset, it even outperforms some of the fully-supervised methods that rely on annotated CamVid images for training.…”
Section: Results (mentioning)
confidence: 90%
“…In Table 3, we compare our approach with the state-of-the-art weakly-supervised method of [39] and with state-of-the-art domain adaptation methods. The results for these methods were directly taken from their respective papers.…”
Section: Results (mentioning)
confidence: 99%
“…Furthermore, most existing works are dedicated to handling multiple salient foreground instances and evaluate on the Pascal VOC dataset [12]. [36] is the only existing work that considers complete scene parsing (background + foreground) with only image-level labels, by leveraging a two-stream deep architecture and a heat map loss. However, their result does not perform well compared to other adaptation methods on the Cityscapes dataset.…”
Section: Weak Image-level Supervision (mentioning)
confidence: 99%
“…Recently, weakly-supervised methods for video object segmentation [40,42,31,41] have been developed to relax the need for annotations where only class-level labels are required. These approaches have significantly reduced the labor-intensive step of collecting pixel-wise training data on target categories.…”
Section: Introduction (mentioning)
confidence: 99%