2018
DOI: 10.48550/arxiv.1805.06157
Preprint

Zero-Shot Object Detection by Hybrid Region Embedding

Abstract: Object detection is considered one of the most challenging problems in computer vision, since it requires correct prediction of both the classes and the locations of objects in images. In this study, we define a more difficult scenario, namely zero-shot object detection (ZSD), where no visual training data is available for some of the target object classes. We present a novel approach to tackle this ZSD problem, where a convex combination of embeddings is used in conjunction with a detection framework. For evaluati…

Cited by 5 publications (12 citation statements)
References 37 publications
“…They also propose a generalization version of ZSD called generalized zero-shot object detection (GZSD) which aims to detect seen and unseen objects together. Demirel et al [6] adopt the hybrid region embedding to improve performance. Zhu et al [8] introduce ZS-YOLO, which is built on a one-step YOLOv2 [48] detector.…”
Section: Related Work (mentioning)
confidence: 99%
“…Training Process. Compared with previous achievements [5,6,7,8] needing multi-step training and pre-trained weights on seen or unseen data, the training process of our model is very simple and convenient with a two step manner. Loss Function.…”
Section: Learning (mentioning)
confidence: 99%
“…The authors in [3] incorporate an improved semantic mapping for the background in an iterative manner by first projecting the seen class visual features to their corresponding semantics and then the background bounding boxes to a set of diverse unseen semantic vectors. [4] learns an embedding space as a convex combination of training class wordvecs. [5] uses a Recurrent Neural Network to model natural language description of objects in the image.…”
Section: Related Work (mentioning)
confidence: 99%
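
The statement above describes an embedding built as a convex combination of training-class word vectors. A minimal sketch of that general idea follows, assuming the combination weights come from a softmax over seen-class scores for one region. The toy word vectors, the scores, and the helper names `softmax` and `convex_combination` are hypothetical illustrations, not the cited paper's implementation.

```python
import math

def softmax(scores):
    """Turn raw scores into non-negative weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def convex_combination(weights, vectors):
    """Weighted sum of vectors; weights must be non-negative and sum to 1."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]

# Hypothetical 3-d word vectors for three seen (training) classes.
seen_wordvecs = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.0, 1.0, 0.0],
    "car": [0.0, 0.0, 1.0],
}

# Hypothetical seen-class scores for a single candidate region.
region_scores = [2.0, 2.0, 0.0]

weights = softmax(region_scores)
region_embedding = convex_combination(weights, list(seen_wordvecs.values()))
```

Because the weights are non-negative and sum to 1, the resulting embedding always lies inside the convex hull of the seen-class word vectors, which is what makes the combination convex rather than merely linear.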
“…ZSD is commonly accomplished by learning to project visual representations of different objects to a pre-defined semantic embedding space, and then performing nearest neighbor search in the semantic space at inference [2,3,4,5]. Since the unseen examples are never visualized during training, the model gets significantly biased towards the seen objects [6,7], leading to problems such as confusion with background and mode collapse resulting in high scores for only some unseen classes.…”
Section: Introduction (mentioning)
confidence: 99%
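
The inference step described in the statement above, a nearest-neighbor search in the semantic space, can be sketched as follows. This assumes cosine similarity as the distance measure and a region feature that has already been projected into the semantic space. The class names, vectors, and the helpers `cosine` and `nearest_class` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_class(embedding, class_vectors):
    """Return the class whose semantic vector is most similar to the embedding."""
    return max(class_vectors, key=lambda name: cosine(embedding, class_vectors[name]))

# Hypothetical semantic vectors for two unseen classes.
unseen_wordvecs = {
    "zebra": [0.9, 0.1, 0.0],
    "truck": [0.0, 0.2, 0.9],
}

# Hypothetical region feature already projected into the semantic space.
projected_region = [0.8, 0.3, 0.1]

label = nearest_class(projected_region, unseen_wordvecs)
```

A search like this only ranks the classes it is given, so restricting the candidates to unseen classes (as here) sidesteps the seen-class bias the statement mentions, while a generalized setting would have to search over seen and unseen classes together.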