Bagging Regional Classification Activation Maps for Weakly Supervised Object Localization

Zhu, Lei; Chen, Qian; Jin, Lujia; You, Yunfei; Lu, Yanye

doi:10.1007/978-3-031-20080-9_11

Cited by 14 publications

(4 citation statements)

References 35 publications


“…Table 2 shows the performance of our method and the state-of-the-art weakly supervised object localization methods, including FAM [33], CREAM [49], BGC [24], BAS [47], DAOL [61], BagCAM [60], TS-CAM [15], LCTR [10], ViTOL [18], and SCM [2] on CUB-200-2011 and ImageNet-1K datasets. We also compare our method with PSOL [53] and C²AM [48] which are self-supervised methods.…”

Section: Results (mentioning)

confidence: 99%

“…We present the results of both supervised and self-supervised pre-trained models in Table 10. Interestingly, we found that the results of the supervised model were inferior to those of the self-supervised model. This disparity can be linked to a well-established challenge in object localization with class-level supervision [13,27,46].…”

[Table rows interleaved in the extracted excerpt: [60] ResNet50 84.88 69.97; ViTOL '22 [18] ViT-S 73.17 70.47; SCM '22 [2] ViT-S 89.90 -; Self-supervised methods: C²AM '22 [48] ResNet50]

Section: Advantage Of SSL Pre-trained Backbone (mentioning)

confidence: 86%

“…GT-known Loc computes the ratio of the images where the Intersection over Union (IoU) between the estimated bounding box and the known ground-truth boxes is greater than 50%. Top-1 Loc and Top-5 Loc compute the ratio of the images correctly classified with Top-1 and Top-…”

[Table rows interleaved in the extracted excerpt: [60] 74.51 90.38 52.17 67.68; Self-supervised methods: C²AM '22 [48] 69…]

Section: Datasets and Evaluation Metrics (mentioning)

confidence: 99%
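The GT-known Loc definition quoted above can be illustrated with a minimal Python sketch. It is not taken from any of the cited papers: the box format `(x_min, y_min, x_max, y_max)`, the function names `iou` and `gt_known_loc`, and the rule that an image counts as a hit when the predicted box matches any one of its ground-truth boxes are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def gt_known_loc(
    pred_boxes: Sequence[Box],
    gt_boxes: Sequence[List[Box]],
    threshold: float = 0.5,
) -> float:
    """Fraction of images whose predicted box has IoU > threshold with a
    ground-truth box (assumption: matching any one ground-truth box counts)."""
    if not pred_boxes:
        return 0.0
    hits = sum(
        1
        for pred, gts in zip(pred_boxes, gt_boxes)
        if any(iou(pred, gt) > threshold for gt in gts)
    )
    return hits / len(pred_boxes)
```

For example, `gt_known_loc([(0, 0, 2, 2), (0, 0, 2, 2)], [[(0, 0, 2, 2)], [(1, 1, 3, 3)]])` counts the first image as a hit (IoU of 1.0) and the second as a miss (IoU of 1/7). Top-1 Loc and Top-5 Loc would additionally require the classification to be correct, which this sketch does not model.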


A Task of Convergent AI Ethics Education in School Curriculum - with emphasis on the major of AI Convergent Education in graduate schools of education and the class of ‘AI Ethics’ -

Song

2022

Journal of Ethics Education Studies


We propose a novel unsupervised object localization method that allows us to explain the predictions of the model by utilizing self-supervised pre-trained models without additional fine-tuning. Existing unsupervised and self-supervised object localization methods often utilize class-agnostic activation maps or self-similarity maps of a pre-trained model. Although these maps can offer valuable information for localization, their limited ability to explain how the model makes predictions remains challenging. In this paper, we propose a simple yet effective unsupervised object localization method based on representer point selection, where the predictions of the model can be represented as a linear combination of representer values of training points. By selecting representer points, which are the most important examples for the model predictions, our model can provide insights into how the model predicts the foreground object by providing relevant examples as well as their importance. Our method outperforms the state-of-the-art unsupervised and self-supervised object localization methods on various datasets with significant margins and even outperforms recent weakly supervised and few-shot methods. Our code is available at: https://github.com/yeonghwansong/UOLwRPS


“…This method is also cost-intensive as CAMs must be averaged from different classes. To overcome this limitation, [43,44] propose a method to suppress the background regions to help the network identify foreground regions with high confidence.…”

Section: Related Work (mentioning)

confidence: 99%

Discriminative Sampling of Proposals in Self-Supervised Transformers for Weakly Supervised Object Localization

Murtaza, Belharbi, Pedersoli et al.

2023

2023 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW)


Self-supervised vision transformers (SSTs) have shown great potential to yield rich localization maps that highlight different objects in an image. However, these maps remain class-agnostic since the model is unsupervised. They often tend to decompose the image into multiple maps containing different objects while being unable to distinguish the object of interest from background noise objects. In this paper, Discriminative Pseudo-label Sampling (DiPS) is introduced to leverage these class-agnostic maps for weakly-supervised object localization (WSOL), where only image-class labels are available. Given multiple attention maps, DiPS relies on a pre-trained classifier to identify the most discriminative regions of each attention map. This ensures that the selected ROIs cover the correct image object while discarding the background ones, and, as such, provides a rich pool of diverse and discriminative proposals to cover different parts of the object. Subsequently, these proposals are used as pseudo-labels to train our new transformer-based WSOL model designed to perform classification and localization tasks. Unlike standard WSOL methods, DiPS optimizes performance in both tasks by using a transformer encoder and a dedicated output head for each task, each trained using dedicated loss functions. To avoid overfitting a single proposal and promote better object coverage, a single proposal is randomly selected among the top ones for a training image at each training step. Experimental results on the challenging CUB, ILSVRC, OpenImages, and TelDrone datasets indicate that our architecture, in combination with our transformer-based proposals, can yield better localization performance than state-of-the-art methods.


Pro2SAM: Mask Prompt to SAM with Grid Points for Weakly Supervised Object Localization

Yang, Duan, Wang et al.

2024

Lecture Notes in Computer Science
