Proceedings of the 11th International Conference on Natural Language Generation 2018
DOI: 10.18653/v1/w18-6516

SpatialVOC2K: A Multilingual Dataset of Images with Annotations and Features for Spatial Relations between Objects

Abstract: We present SpatialVOC2K, the first multilingual image dataset with spatial relation annotations and object features for image-to-text generation, built using 2,026 images from the PASCAL VOC2008 dataset. The dataset incorporates (i) the labelled object bounding boxes from VOC2008, (ii) geometrical, language and depth features for each object, and (iii) for each pair of objects in both orders, (a) the single best preposition and (b) the set of possible prepositions in the given language that describe the spatial…

Cited by 8 publications (9 citation statements)
References 15 publications
“…SpatialVOC2K [158] is the first multilingual image dataset with spatial relation annotations and object features for image-to-text generation. It consists of all 2,026 images…”
Section: D Datasets (mentioning)
confidence: 99%
“…The first dataset used is one containing spatial relations in images [9,10,11], consisting of 20 different objects and 17 different target values. It is split into two parts, of which the entitled best part is used, containing 5317 examples, further subdivided into five stratified folds.…”
Section: SpatialVOC2K (mentioning)
confidence: 99%
“…Two datasets are used to evaluate the paradigms against a baseline. The first is a real-world dataset of spatial relations in images, SpatialVOC2K [9,10,11]. The second is a synthetic dataset consisting of nine clusters, which can be seen in Figure 1, used to experiment freely with its characteristics.…”
Section: Introduction (mentioning)
confidence: 99%
“…The SpatialVOC2K (Belz et al, 2018) dataset is used to train and test the pattern recognition models. This dataset consists of 2,026 images with object labels, bounding boxes annotations extracted from the PASCAL VOC2008 challenge dataset (Everingham et al, 2007), to which relations between objects and depth values were added (Belz et al, 2018).…”
Section: Dataset (mentioning)
confidence: 99%