2020
DOI: 10.48550/arxiv.2012.05672
Preprint

Imitating Interactive Intelligence

Abstract: A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language. Here we study how to design artificial agents that can interact naturally with humans using the simplification of a virtual environment. This setting nevertheless integrates a number of the central challenges of artificial intelligence (AI) research: complex visual perception and goal-directed physical control, gro…

Cited by 20 publications (43 citation statements)
References 62 publications
“…We also focus on one domain in order to compare with prior work which has argued that human-in-the-loop training is necessary. Consequently, the resulting agents are only designed to adaptively collaborate on a single task, and not to infer human preferences in general [1,32,58]. Moreover, if a task's reward function is poorly aligned with how humans approach the task, our method may well produce subpar partners, as would any method without access to human data.…”
Section: Discussion
confidence: 99%
“…Our results are based on three procedurally generated video datasets of multi-object scenes. In increasing order of difficulty, they are: Objects Room 9 [42], CATER (moving camera) [43], and Playroom [44]. These were chosen to meet a number of criteria: we wanted at least 9-10 objects per scene (there could be fewer in view, or as many as 25 in the case of Playroom).…”
Section: Comparative Evaluation
confidence: 99%
“…The Playroom is a Unity-based environment for object-centric tasks [67,44], originally released as pre-packaged Docker containers with an Apache 2.0 license. We used an arbitrary behaviour policy (trained by demonstrations) to generate video sequences from the environment (one per episode).…”
Section: A23 Playroom
confidence: 99%
“…In interpreting the effect of goal position, we note that, throughout the test phase, 25% of the puzzles used the same goal cell as the puzzles during the tutorial and practice phases, whereas when the goal position was changed, the goal cell would be any one of 16 possible cells with 1/32 probability or 64 cells with 1/256 probability. Thus, the persistent effect of goal position might result from a justifiable bias in attention toward the most common goal cell location, producing a small cost when attention must be deployed to a less likely position.…”
Section: Goal Position
confidence: 99%
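The goal-cell probabilities quoted in that statement can be checked to form a proper distribution. A quick sketch of the arithmetic (our own sanity check, not code from the cited paper):

```python
# Sanity check: the stated test-phase goal-cell distribution should sum to 1.
# 25% of puzzles reuse the tutorial/practice goal cell; otherwise the goal is
# one of 16 cells with probability 1/32 each, or one of 64 cells with 1/256 each.
from fractions import Fraction

p_same = Fraction(1, 4)                # 25% same goal cell as tutorial/practice
p_16_cells = 16 * Fraction(1, 32)      # 16 cells x 1/32 = 1/2
p_64_cells = 64 * Fraction(1, 256)     # 64 cells x 1/256 = 1/4

total = p_same + p_16_cells + p_64_cells
print(total)  # 1
```

Using exact `Fraction` arithmetic avoids floating-point rounding and confirms the three cases partition the test-phase puzzles exactly.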
“…A further important extension will be to allow the simultaneous use of language to receive and generate instructions and explanations as partially implemented in [1]. This model could learn both by independently interacting with the environment as in traditional reinforcement learning, but also from a structured curriculum of mathematical problemsolving tasks requiring following instructions and receiving and producing explanations.…”
Section: The Path Forward
confidence: 99%