Natural Language Communication with Robots

Bisk, Yonatan; Yüret, Deniz; Marcu, Daniel

doi:10.18653/v1/n16-1089

Cited by 97 publications (158 citation statements)

References 19 publications (16 reference statements)

Citation types: Supporting, Mentioning (155), Contrasting


“…This is an improvement over both rule based benchmark with 1.54 and the best model reported by Bisk et al (2016b), who had 0.98. The median distance is 0.04 which is much better than their comparable End-To-End model with median distance 0.53.…”

Section: Results (mentioning)

confidence: 77%
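
The figures quoted above are summary statistics over per-example distance errors. As a minimal sketch, assuming each error is the Euclidean distance between a predicted and a gold block position (the function name and the toy coordinates are illustrative and are not taken from either paper), the mean and median could be computed like this:

```python
import math
import statistics

def distance_errors(predicted, gold):
    """Euclidean distance between each predicted and gold position."""
    return [math.dist(p, g) for p, g in zip(predicted, gold)]

# Toy positions (hypothetical), not data from the papers.
predicted = [(0.0, 0.0), (1.0, 1.0), (3.0, 4.0)]
gold = [(0.0, 0.0), (1.0, 1.0), (0.0, 0.0)]

errors = distance_errors(predicted, gold)
mean_error = statistics.mean(errors)      # pulled up by the single large miss
median_error = statistics.median(errors)  # unaffected by the single large miss
```

In this toy case one large miss raises the mean while the median stays at zero, which may be why a paper would report both numbers.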

“…In this paper, we propose several models solving this task and report improvement compared to the previous work by Bisk et al (2016b).…”

Section: Introduction (mentioning)

confidence: 99%

“…The first approach using neural networks is proposed by Bisk et al (2016b), who describe and compare several neural models for understanding natural language commands. Their dataset (Bisk et al, 2016a) contains simulated world with square blocks and actions descriptions in English (see Figure 1).…”

Section: Introduction (mentioning)

confidence: 99%


Communication with Robots using Multilayer Recurrent Networks

Pišl, Mareček

2017

Proceedings of the First Workshop on Language Grounding for Robotics

“…Environment We use the environment of Bisk et al (2016). The original task required predicting the source and target positions for a single block given an instruction.…”

Section: Methods (mentioning)

confidence: 99%

“…This approach offers multiple benefits, such as not requiring intermediate representations, planning procedures, or training multiple models. Figure 1 illustrates the problem in the Blocks environment (Bisk et al, 2016). The agent observes the environment as an RGB image using a camera sensor.…”

Section: Introduction (mentioning)

confidence: 99%

Mapping Instructions and Visual Observations to Actions with Reinforcement Learning

Misra, Langford, Artzi

2017

Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing



We propose to directly map raw visual observations and text input to actions for instruction execution. While existing approaches assume access to structured environment representations or use a pipeline of separately trained models, we learn a single model to jointly reason about linguistic and visual input. We use reinforcement learning in a contextual bandit setting to train a neural network agent. To guide the agent's exploration, we use reward shaping with different forms of supervision. Our approach does not require intermediate representations, planning procedures, or training different models. We evaluate in a simulated environment, and show significant improvements over supervised learning and common reinforcement learning variants.
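
To make the contextual bandit setting and reward shaping in this abstract more concrete, here is a minimal sketch under loudly stated assumptions. It swaps the paper's neural network for a linear softmax policy, replaces the visual environment with a toy 1-D world, and assumes one particular shaping term (a bonus for moving closer to the goal). None of the names below come from the paper.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def action_probs(weights, context):
    """Linear softmax policy: one weight row per action (assumed form)."""
    return softmax([sum(w * x for w, x in zip(row, context)) for row in weights])

def shaped_reward(task_reward, dist_before, dist_after):
    """Task reward plus an assumed bonus for moving closer to the goal."""
    return task_reward + (dist_before - dist_after)

def bandit_update(weights, context, action, reward, lr=0.1):
    """One policy-gradient step that uses only the immediate reward."""
    probs = action_probs(weights, context)
    for a, row in enumerate(weights):
        coeff = (1.0 if a == action else 0.0) - probs[a]  # d log pi / d score
        for i, x in enumerate(context):
            row[i] += lr * reward * coeff * x

# Toy 1-D world (hypothetical): action 0 moves left, action 1 moves right.
random.seed(0)
weights = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(500):
    offset = random.choice([-3, -2, -1, 1, 2, 3])  # goal position minus agent position
    context = [float(offset), 1.0]
    action = random.choices([0, 1], weights=action_probs(weights, context))[0]
    new_offset = offset - (1 if action == 1 else -1)
    task_reward = 1.0 if new_offset == 0 else 0.0
    reward = shaped_reward(task_reward, abs(offset), abs(new_offset))
    bandit_update(weights, context, action, reward)

p_right_when_goal_right = action_probs(weights, [2.0, 1.0])[1]
p_right_when_goal_left = action_probs(weights, [-2.0, 1.0])[1]
```

Each update uses only the reward of the action just taken, with no estimate of future return, which is the sense of "contextual bandit" assumed here. The shaping term gives a learning signal on every step, even when the task reward itself is zero.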
