2020
DOI: 10.48550/arxiv.2012.12510
Preprint

Towards Overcoming False Positives in Visual Relationship Detection

Daisheng Jin,
Xiao Ma,
Chongzhi Zhang
et al.

Abstract: In this paper, we investigate the cause of the high false positive rate in Visual Relationship Detection (VRD). We observe that during training, the relationship proposal distribution is highly imbalanced: most of the negative relationship proposals are easy to identify, e.g., the inaccurate object detection, which leads to the under-fitting of low-frequency difficult proposals. This paper presents Spatially-Aware Balanced negative pRoposal sAmpling (SABRA), a robust VRD framework that alleviates the influence …

Cited by 2 publications (4 citation statements)
References 36 publications
“…VRD: Table 2 presents comparisons on VRD with eight state-of-the-art methods: VRD [3], KL distillation [23], Zoom-net [25], CAI + SCA-M [25], RelDN [37], AVR [42], GPS-Net [43], and SABRA [44]. For a fair comparison of VRD, we adopt the VGG-16 backbone pretrained on ImageNet used for these baselines to train our model.…”
Section: Compared Results
type: mentioning
confidence: 99%
“…[Table fragment: Relation detection (R@50, R@100) and Phrase detection (R@50, R@100); row 1, Baseline (20% noisy labels): 18.00, 22. 44 …] …, making it more precise to parse the scene graph. It also can be observed that our model processed more complete detection of visual relationships; some triplets that are not in ground-truth are also precisely detected.…”
Section: K Methods
type: mentioning
confidence: 99%
“…Beyond that, there are several other noteworthy DL architectures for scene graph generation and visual relationship detection. Jin et al. [197] proposed SABRA to classify all the negative triplet samples into five general types and use the Balanced Negative Proposal Sampling to get training samples with a balanced distribution. This approach controls the weight on different sample types, diminishing the side effect of the negative samples on prediction.…”
Section: Other SGG Methods
type: mentioning
confidence: 99%
“…Recently, two novel techniques i.e., SABRA [197] and HET [195] have achieved SOTA performance for PhrDet and RelDet on VRD, respectively. SABRA enhanced the robustness of the training process of the proposed model by subdividing negative samples, while HET followed the intuitive perspective i.e., the more salient the object, the more important it would be for the scene graph.…”
Section: Quantitative Performance
type: mentioning
confidence: 99%