2022
DOI: 10.1109/lra.2022.3193254

DialFRED: Dialogue-Enabled Agents for Embodied Instruction Following

Abstract: Language-guided Embodied AI benchmarks requiring an agent to navigate an environment and manipulate objects typically allow one-way communication: the human user gives a natural language command to the agent, and the agent can only follow the command passively. We present DialFRED, a dialogue-enabled embodied instruction following benchmark based on the ALFRED benchmark, which allows an agent to actively ask questions to the human user and use the information in the response to better complete the task. We release a human-annotated dataset with 53K task-relevant questions and answers, together with an oracle to answer questions. To tackle DialFRED, we propose a questioner-performer framework in which the questioner is pre-trained with the human-annotated data and fine-tuned with reinforcement learning.
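The questioner-performer split is easiest to see as a control loop: before each sub-task, a questioner model decides whether to ask the user a clarifying question, and the performer executes actions conditioned on the instruction plus any accumulated dialogue. The Python below is a minimal sketch of that loop under stated assumptions, not the authors' implementation; every name in it (DialogueState, Questioner, Performer, EchoEnv, the oracle_answer callable) is a hypothetical placeholder.

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DialogueState:
    instruction: str                                    # original command
    qa_history: List[str] = field(default_factory=list)

    def context(self) -> str:
        """Instruction plus all accumulated question-answer pairs."""
        return " ".join([self.instruction] + self.qa_history)


class Questioner:
    """Stub: a trained model would decide here whether asking pays off."""
    def maybe_ask(self, state: DialogueState) -> Optional[str]:
        return None  # always stay silent in this stub


class Performer:
    """Stub: a trained policy would map (context, observation) to actions."""
    def act(self, context: str, observation: str) -> str:
        return "Stop"


class EchoEnv:
    """Trivial stand-in environment, just enough to run the loop."""
    def observe(self) -> str:
        return "kitchen"

    def step(self, action: str) -> bool:
        return action == "Stop"  # True terminates the sub-task


def run_subtask(questioner, performer, oracle_answer, state, env):
    """Run one sub-task, optionally asking the user/oracle first."""
    question = questioner.maybe_ask(state)
    if question is not None:
        state.qa_history.append(f"Q: {question} A: {oracle_answer(question)}")
    done = False
    while not done:
        action = performer.act(state.context(), env.observe())
        done = env.step(action)


state = DialogueState("Put the mug in the sink.")
run_subtask(Questioner(), Performer(), lambda q: "on the counter", state, EchoEnv())

The design point is that dialogue enters only through the performer's conditioning context, so the questioner can be trained separately (for example with RL on task-success reward, as the abstract describes) without changing the performer's interface.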


Cited by 21 publications (29 citation statements: 0 supporting, 29 mentioning, 0 contrasting).
References 47 publications.

Citation statements, ordered by relevance:
“…In Lynch et al. (2022) the authors present an RL- and imitation-based framework, Interactive Language, that is capable of continuously adjusting its behavior to natural-language instructions in a real-time interactive setting. There has also been a recent wave of datasets and benchmarks created by using 3D household simulators and crowdsourcing tools to collect large-scale task-oriented dialogue aimed at improving the interactive language capabilities of embodied task-oriented agents (Padmakumar et al., 2022; Gao et al., 2022; Team et al., 2021). Most of the above-mentioned works focus on the verbal mode of communication, and largely on the comprehension side (e.g., instruction following)…”
Section: RL for Communication in Task-oriented Embodied Agents (mentioning)
confidence: 99%
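As a deliberately generic illustration of the recipe these works share, the sketch below fine-tunes an asking policy with REINFORCE, treating "ask a question" as just another action and charging a small per-question cost so that asking must pay for itself in task reward. The episode format, reward shaping, and ask_penalty value are assumptions made for this example, not any cited paper's exact setup.

import torch


def reinforce_update(optimizer, episodes, ask_penalty=0.05):
    """One REINFORCE step over a batch of recorded episodes (assumes a
    non-empty batch).

    Each episode is a list of (log_prob, asked, reward) tuples:
    log_prob is the policy's log-probability of the chosen action
    (a tensor carrying gradients), asked is 1.0 if the action was a
    question and 0.0 otherwise, and reward is the task reward
    received at that step.
    """
    loss = 0.0
    for episode in episodes:
        # Undiscounted return with a per-question cost (illustrative):
        # asking is only worthwhile if it buys extra task reward.
        ret = sum(r - ask_penalty * a for _, a, r in episode)
        for log_prob, _, _ in episode:
            loss = loss - log_prob * ret
    loss = loss / max(len(episodes), 1)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

In DialFRED's questioner-performer setup, for instance, the questioner is pre-trained on human question-answer data and then fine-tuned with RL; a cost term like ask_penalty above is one common way to discourage questions that do not improve task success.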
“…There has also been a recent wave of datasets and benchmarks created by using 3D household simulators and crowdsourcing tools to collect large-scale task-oriented dialogue aimed at improving the interactive language capabilities of embodied task-oriented agents (Padmakumar et al., 2022; Gao et al., 2022; Team et al., 2021)…”
Section: Introduction (mentioning)
confidence: 99%
“…Similarly, another work collects a large dataset of clarification requests (CRs) to user requests, augmented synthetically, in a multi-step process without interaction. Another large-scale dataset with 53k task-relevant questions and answers about an instruction was constructed by Gao et al. (2022). However, the data is created by an annotator who does not have to act, but only watches execution videos, asking a question they think would be helpful and then answering their own question…”
Section: Related Literature (mentioning)
confidence: 99%
“…However, the reasoning mainly focuses on the outcome or the history of navigation over 2D images and does not require a holistic 3D understanding of the environment. There are also works [12, 20, 51, 54, 57, 69] targeting instruction following in embodied environments, in which an agent is asked to perform a series of tasks based on language instructions. Different from their settings, in our benchmark an embodied agent actively explores the environment and takes multi-view images for 3D-related reasoning…”
Section: Related Work (mentioning)
confidence: 99%