Disentangled Relational Representations for Explaining and Learning from Demonstration

Hristov, Yordan; Angelov, Daniel; Burke, Michael G.; Lascarides, Alex; Ramamoorthy, Subramanian

doi:10.48550/arxiv.1907.13627

Cited by 2 publications

(3 citation statements)

References 20 publications

“…Reasoning about spatial references has been explored in various contexts such as instruction following for 2D and 3D navigation (MacMahon et al, 2006;Vogel and Jurafsky, 2010;Chen and Mooney, 2011;Artzi and Zettlemoyer, 2013;Kim and Mooney, 2013;Andreas and Klein, 2015;Fried et al, 2018;Liu et al, 2019;Jain et al, 2019;Gaddy and Klein, 2019;Hristov et al, 2019;Chen et al, 2019) and situated dialog for robotic manipulation (Skubic et al, 2002;Kruijff et al, 2007;Kelleher and Costello, 2009;Landsiedel et al, 2017). Most of these approaches utilize supervised data, either in the form of policy demonstrations or target geometric representations.…”

Section: Spatial Reasoning In Text (mentioning)

confidence: 99%

“…'above', 'below') to perceptual processes like visual signals. While such early grounding efforts were limited by computational bottlenecks, several deep neural architectures have been recently proposed that jointly process text and visual input (Janner et al, 2017;Misra et al, 2017;Bisk et al, 2016;Liu et al, 2019;Jain et al, 2019;Gaddy and Klein, 2019;Hristov et al, 2019;Yu et al, 2018). While these approaches have made significant advances in improving the ability of agents at following spatial instructions, they are either not easily interpretable or require pre-specified parameterization to induce interpretable modules (Bisk et al, 2018).…”

Section: Introduction (mentioning)

confidence: 99%

Robust and Interpretable Grounding of Spatial References with Relation Networks

Yang, Lan, Narasimhan. 2020

Findings of the Association for Computational Linguistics: EMNLP 2020

Learning representations of spatial references in natural language is a key challenge in tasks like autonomous navigation and robotic manipulation. Recent work has investigated various neural architectures for learning multi-modal representations for spatial concepts. However, the lack of explicit reasoning over entities makes such approaches vulnerable to noise in input text or state observations. In this paper, we develop effective models for understanding spatial references in text that are robust and interpretable, without sacrificing performance. We design a text-conditioned relation network whose parameters are dynamically computed with a cross-modal attention module to capture fine-grained spatial relations between entities. This design choice provides interpretability of learned intermediate outputs. Experiments across three tasks demonstrate that our model achieves superior performance, with a 17% improvement in predicting goal locations and a 15% improvement in robustness compared to state-of-the-art systems.

“…[18] has shown that there exist semantics in the latent space of generative adversarial networks (GANs), and [19] successfully decomposes the latent factor in a GAN into structured semantic parts. In addition to GANs, [20] has learned disentangled latent representations in a variational autoencoder (VAE) framework to ground spatial relations between objects. Unlike the information bottleneck motivation of [19], we use metric learning [8] to capture information such as maneuvers and interactions.…”

Section: A Related Work (mentioning)

confidence: 99%

DiversityGAN: Diversity-Aware Vehicle Motion Prediction via Latent Semantic Sampling

Huang, McGill, DeCastro, et al. 2019

Preprint

Vehicle trajectory prediction is crucial for autonomous driving and advanced driver assistant systems. While existing approaches may sample from a predicted distribution of vehicle trajectories, they lack the ability to explore it, a key ability for evaluating safety from a planning and verification perspective. In this work, we devise a novel approach for generating realistic and diverse vehicle trajectories. We extend the generative adversarial network (GAN) framework with a low-dimensional approximate semantic space, and shape that space to capture semantics such as merging and turning. We sample from this space in a way that mimics the predicted distribution, but allows us to control coverage of semantically distinct outcomes. We validate our approach on a publicly available dataset and show results that achieve state-of-the-art prediction performance, while providing improved coverage of the space of predicted trajectory semantics.
