2020
DOI: 10.1109/access.2020.2966647

Saliency Guided Self-Attention Network for Weakly and Semi-Supervised Semantic Segmentation

Abstract: Weakly supervised semantic segmentation (WSSS) using only image-level labels can greatly reduce the annotation cost and therefore has attracted considerable research interest. However, its performance is still inferior to the fully supervised counterparts. To mitigate the performance gap, we propose a saliency guided self-attention network (SGAN) to address the WSSS problem. The introduced self-attention mechanism is able to capture rich and extensive contextual information but also may mis-spread attentions t…
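As a rough illustration of the idea the abstract describes, the sketch below shows a generic non-local self-attention block whose pairwise affinities are gated by a class-agnostic saliency map, so that attention is less likely to spread between foreground and background positions. The module name, the 1x1-projection layout, and the specific gating rule are assumptions made for this example; they are not taken from the SGAN paper itself.

```python
# Minimal sketch, assuming a PyTorch setting. Hypothetical module; the exact
# way SGAN injects saliency into attention is not reproduced here.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SaliencyGuidedSelfAttention(nn.Module):
    def __init__(self, in_channels: int, key_channels: int = 64):
        super().__init__()
        # 1x1 convolutions project features into query/key/value spaces.
        self.query = nn.Conv2d(in_channels, key_channels, kernel_size=1)
        self.key = nn.Conv2d(in_channels, key_channels, kernel_size=1)
        self.value = nn.Conv2d(in_channels, in_channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # residual scaling factor

    def forward(self, x: torch.Tensor, saliency: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map; saliency: (B, 1, H, W), values in [0, 1].
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (B, HW, Ck)
        k = self.key(x).flatten(2)                     # (B, Ck, HW)
        v = self.value(x).flatten(2).transpose(1, 2)   # (B, HW, C)

        attn = torch.bmm(q, k) / (k.shape[1] ** 0.5)   # pairwise affinities (B, HW, HW)

        # Assumed saliency gating: down-weight affinities between positions on
        # opposite sides of the foreground/background split, so attention does
        # not "mis-spread" across object boundaries.
        s = saliency.flatten(2).transpose(1, 2)        # (B, HW, 1)
        same_region = s * s.transpose(1, 2) + (1 - s) * (1 - s.transpose(1, 2))
        attn = F.softmax(attn, dim=-1) * same_region
        attn = attn / attn.sum(dim=-1, keepdim=True).clamp(min=1e-6)

        out = torch.bmm(attn, v).transpose(1, 2).reshape(b, c, h, w)
        return x + self.gamma * out


if __name__ == "__main__":
    feat = torch.randn(2, 256, 32, 32)
    sal = torch.rand(2, 1, 32, 32)
    print(SaliencyGuidedSelfAttention(256)(feat, sal).shape)  # torch.Size([2, 256, 32, 32])
```

The gating keeps the standard non-local formulation intact and only re-normalizes the attention weights, which is one simple way a saliency prior could constrain where contextual information is aggregated from.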


Cited by 78 publications (54 citation statements: 2 supporting, 52 mentioning, 0 contrasting)
References 37 publications
“…In particular, we observe that even our VGG-16-based model outperforms all previous methods trained using only ground truth classification labels, showing the effectiveness of our approach, and of image captions as supervision for this task. Our method also presents comparable results to those reported in [43] for SGAN, despite the fact that this model leverages much stronger supervision extensively in the form of class-agnostic saliency maps, generated by a model trained with pixel-level supervision.…”
Section: MS-COCO 2014 (supporting)
confidence: 57%
“…In this work, we focus on a different but complementary problem: generating accurate segmentation masks using natural language captions instead of classification labels, which is not possible using these previous methods. Our method also introduces an effective way to take advantage of complementary labels for background categories and visual attributes for mask generation, which cannot be leveraged with existing systems.…”
Section: B. WSSS Using Classification Labels (mentioning)
confidence: 99%