2021
DOI: 10.48550/arxiv.2103.04523
Preprint

Unveiling the Potential of Structure Preserving for Weakly Supervised Object Localization

Abstract: Weakly supervised object localization (WSOL) remains an open problem given the deficiency of finding object extent information using a classification network. Although prior works struggled to localize objects through various spatial regularization strategies, we argue that how to extract object structural information from the trained classification network is neglected. In this paper, we propose a two-stage approach, termed structure-preserving activation (SPA), toward fully leveraging the structure informati…

citations
Cited by 2 publications
(2 citation statements)
references
References 29 publications
“…Contextual information [19], attention mechanism [43], gradient map [40] and semantic segmentation [48] are leveraged to learn accurate object proposals. CAM-based methods [33,41,41,55,57] produce localization maps by aggregating deep feature maps using a class-specific fully connected layer. Despite the simplicity and effectiveness of CAM-based methods, they suffer from identifying small discriminative parts of objects.…”
Section: Related Work (type: mentioning)
confidence: 99%
“…SPG [56] and I²C [57] increased the quality of localization maps by introducing the constraint of pixel-level correlations into the network. SPA [33] and TS-CAM [13] obtain accurate localization maps with the help of long-range structural information. Different from the WSOD task, our task leverages the correctly attended regions for caption generation.…”
Section: Related Work (type: mentioning)
confidence: 99%
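
Illustration: the first citation statement describes CAM-based methods as producing localization maps "by aggregating deep feature maps using a class-specific fully connected layer." The sketch below shows one way that aggregation could look, as a weighted sum of feature maps. It is not the SPA method or any cited paper's implementation. The function names, the toy values, the min-max normalization, and the 0.5 threshold used to derive a box are all assumptions made for illustration.

```python
# Minimal CAM-style aggregation in pure Python (illustrative assumptions only).

def class_activation_map(feature_maps, fc_weights, class_idx):
    """Sum K feature maps (each H x W), weighted by one class's FC weights."""
    weights = fc_weights[class_idx]
    if len(weights) != len(feature_maps):
        raise ValueError("need one FC weight per feature map")
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for w, fmap in zip(weights, feature_maps):
        for i in range(height):
            for j in range(width):
                cam[i][j] += w * fmap[i][j]
    return cam


def normalize(cam):
    """Min-max scale the map to [0, 1]; a flat map becomes all zeros."""
    lo = min(min(row) for row in cam)
    hi = max(max(row) for row in cam)
    span = hi - lo
    if span == 0:
        return [[0.0] * len(row) for row in cam]
    return [[(v - lo) / span for v in row] for row in cam]


def bounding_box(cam, threshold=0.5):
    """Tightest (row_min, col_min, row_max, col_max) box around cells >= threshold."""
    cells = [(i, j) for i, row in enumerate(cam)
             for j, v in enumerate(row) if v >= threshold]
    if not cells:
        return None
    rows = [i for i, _ in cells]
    cols = [j for _, j in cells]
    return (min(rows), min(cols), max(rows), max(cols))


# Toy input: 2 feature maps of size 3 x 3 and FC weights for 2 classes.
feature_maps = [
    [[0.0, 0.0, 0.0], [0.0, 4.0, 2.0], [0.0, 2.0, 0.0]],  # responds to one region
    [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],  # uniform response
]
fc_weights = [
    [1.0, 0.0],  # class 0 uses only the first feature map
    [0.0, 1.0],  # class 1 uses only the second feature map
]

cam0 = normalize(class_activation_map(feature_maps, fc_weights, 0))
box0 = bounding_box(cam0)
cam1 = normalize(class_activation_map(feature_maps, fc_weights, 1))
box1 = bounding_box(cam1)
print(box0, box1)
```

In this toy case the box for class 0 covers only the cells with the strongest response, which loosely mirrors the limitation the statement raises: the map highlights the most discriminative part rather than the full object extent.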