What is a good visual representation for navigation? We study this question in the context of semantic visual navigation, which is the problem of a robot finding its way through a previously unseen environment to a target object, e.g., "go to the refrigerator." Instead of acquiring a metric semantic map of an environment and using planning for navigation, our approach learns navigation policies on top of representations that capture spatial layout and semantic contextual cues. We propose to use semantic segmentation and detection masks, obtained by state-of-the-art computer vision algorithms, as observations, and to learn the navigation policy with a deep network. The availability of equitable representations in simulated environments enables joint training on real and simulated data and alleviates the need for the domain adaptation or domain randomization commonly used to tackle sim-to-real transfer of learned policies. Both the representation and the navigation policy can be readily applied to real, non-synthetic environments, as demonstrated on the Active Vision Dataset [1]. Our approach successfully reaches the target in 54% of cases in unexplored environments, compared to 46% for a non-learning-based approach and 28% for a learning-based baseline.
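To make the idea concrete, here is a minimal sketch (not the paper's actual code) of a policy network that consumes semantic masks rather than raw RGB. The class count, action set, and layer sizes are illustrative assumptions; the point is that the same C-channel mask tensor can come from a simulator's ground truth or from a real segmenter/detector, so one policy serves both domains.

```python
# Sketch: a navigation policy over semantic segmentation / detection masks.
# NUM_CLASSES, ACTIONS, and the architecture are assumptions for illustration.
import torch
import torch.nn as nn

NUM_CLASSES = 40  # assumed number of semantic categories, one mask channel each
ACTIONS = ["forward", "turn_left", "turn_right", "stop"]  # hypothetical action set

class MaskPolicy(nn.Module):
    def __init__(self, num_classes: int = NUM_CLASSES, num_actions: int = len(ACTIONS)):
        super().__init__()
        # Encoder over binary/one-hot mask channels instead of RGB pixels.
        self.encoder = nn.Sequential(
            nn.Conv2d(num_classes, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, num_actions)

    def forward(self, masks: torch.Tensor) -> torch.Tensor:
        # masks: (B, num_classes, H, W) segmentation/detection mask channels
        return self.head(self.encoder(masks))

policy = MaskPolicy()
obs = torch.zeros(1, NUM_CLASSES, 84, 84)  # stand-in for one segmented frame
obs[0, 7].fill_(1.0)                       # e.g. the "refrigerator" channel fires
action = policy(obs).argmax(dim=-1)
print(ACTIONS[action.item()])
```

Because the observation is a semantic abstraction, swapping the simulator's perfect masks for noisy detector output at deployment changes the input distribution far less than swapping rendered pixels for camera pixels would.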
Learned neural-network-based policies have shown promising results for robot navigation. However, most of these approaches fall short of being usable on a real robot because of the extensive simulated training they require: the simulations lack the visuals and dynamics of the real world, which makes the resulting policies infeasible to deploy. We present a novel neural-network-based policy, NavNet, which allows for easy deployment on a real robot. It consists of two sub-policies: a high-level policy that can understand real images and perform long-range planning expressed in high-level commands, and a low-level policy that can translate the long-range plan into low-level commands for a specific platform in a safe and robust manner. For every new deployment, the high-level policy is trained on an easily obtainable scan of the environment that models its visuals and layout. We detail the design of such an environment and how it can be used for training a final navigation policy. Further, we demonstrate a learned low-level policy. We deploy the model in a large office building and test it extensively, achieving a 0.80 success rate over long navigation runs and outperforming SLAM-based approaches in the same settings.
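The split between the two sub-policies can be sketched as a simple control loop; the command vocabulary, function names, and safety logic below are assumptions for illustration, not NavNet's actual interfaces.

```python
# Sketch of a two-level navigation stack: a high-level policy maps an image
# and goal to an abstract command; a low-level policy turns that command into
# platform controls. All names and values here are illustrative assumptions.
from dataclasses import dataclass

HIGH_LEVEL_COMMANDS = ["go_straight", "turn_left", "turn_right", "stop"]  # assumed vocabulary

@dataclass
class VelocityCommand:
    linear: float   # m/s
    angular: float  # rad/s

def high_level_policy(image, goal: str) -> str:
    """Placeholder for the learned long-range planner.

    In the real system this would be a network trained on a scan of the
    deployment environment; here it trivially halts.
    """
    return "stop"

def low_level_policy(command: str, obstacle_ahead: bool) -> VelocityCommand:
    """Translate an abstract command into safe, platform-specific controls.

    The obstacle override stands in for the learned, robustness-focused
    low-level controller described in the abstract.
    """
    if obstacle_ahead or command == "stop":
        return VelocityCommand(0.0, 0.0)
    if command == "go_straight":
        return VelocityCommand(0.5, 0.0)
    if command == "turn_left":
        return VelocityCommand(0.0, 0.8)
    if command == "turn_right":
        return VelocityCommand(0.0, -0.8)
    raise ValueError(f"unknown command: {command}")

# One control tick: plan at the abstract level, execute at the platform level.
cmd = high_level_policy(image=None, goal="kitchen")
print(low_level_policy(cmd, obstacle_ahead=False))
```

The design choice this illustrates is that only the high-level policy must be retrained per deployment (from an environment scan), while the low-level controller stays fixed to the robot platform.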