Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1651

CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication

Abstract: In this work, we propose a goal-driven collaborative task that combines language, perception, and action. Specifically, we develop a Collaborative image-Drawing game between two agents, called CoDraw. Our game is grounded in a virtual world that contains movable clip art objects. The game involves two players: a Teller and a Drawer. The Teller sees an abstract scene containing multiple clip art pieces in a semantically meaningful configuration, while the Drawer tries to reconstruct the scene on an empty canvas…

Cited by 46 publications (61 citation statements)
References 45 publications
“…It is shown with minor modifications, Text2Scene can generate cartoon like, semantic layout, and real image like scenes. Dialogue based interaction is studied to control image synthesis, in order to improve complex scene generation progressively [219]-[223]. Meanwhile, text-to-image synthesis is extended to multiple images or videos, where visual consistency is required among the generated images [224]-[226].…”
Section: Other Topics (mentioning)
confidence: 99%
“…For this task, we use the synthetic Collaborative Drawing (CoDraw) dataset [8], which is composed of sequences of images along with associated dialogue of instructions and linguistic feedback (Figure 2). Also, we introduce the Iterative CLEVR (i-CLEVR) dataset (Figure 4), a modified version of the Compositional Language and Elementary Visual Reasoning (CLEVR) [9] dataset, for incremental construction of CLEVR scenes based on linguistic instructions.…”
Section: GeNeVA Task and Datasets (mentioning)
confidence: 99%
“…The most similar task to GeNeVA is the task proposed by the CoDraw [8] authors. They require a model to build a scene by placing the clip art images of the individual objects in their correct positions.…”
Section: GeNeVA Task and Datasets (mentioning)
confidence: 99%