2023 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW)
DOI: 10.1109/wacvw58289.2023.00021

Discriminative Sampling of Proposals in Self-Supervised Transformers for Weakly Supervised Object Localization

Abstract: Self-supervised vision transformers (SSTs) have shown great potential to yield rich localization maps that highlight different objects in an image. However, these maps remain class-agnostic since the model is unsupervised. They often tend to decompose the image into multiple maps containing different objects while being unable to distinguish the object of interest from background noise objects. In this paper, Discriminative Pseudo-label Sampling (DiPS) is introduced to leverage these class-agnostic maps for we…

Cited by 6 publications (1 citation statement)
References 60 publications
“…The authors propose a stochastic sampling of local evidence as opposed to common practice in the literature, where pseudolabels are selected and fixed before training. The F-CAM method was further adapted for transformer-based methods (Murtaza et al., 2023, 2022) for WASOL in drone-surveillance, and subsequently, for WSOL in videos (Belharbi et al., 2023). Following F-CAM architecture, NEGEV was proposed for histology data to improve localization and classifier interpretability.…”
Section: CAM Refinement Methods
Citation type: mentioning
confidence: 99%