In Embodied Question Answering (EmbodiedQA), an agent interacts with an environment to gather the information needed to answer user questions. Existing work has laid a solid foundation for this problem, but current performance, especially in navigation, suggests that EmbodiedQA may be too challenging for contemporary approaches. In this paper, we study the problem empirically and introduce 1) a simple yet effective baseline that achieves promising performance; and 2) an easier, more practical setting for EmbodiedQA in which an agent can adapt the trained model to a new environment before it actually answers users' questions. In this new setting, we randomly place a few objects in the new environment and upgrade the agent's policy with a distillation network so that it retains the generalization ability of the trained model. On the EmbodiedQA v1 benchmark, under the standard setting, our simple baseline achieves results competitive with the state of the art; in the new setting, we find that this small change yields a notable gain in navigation.

Index Terms: Embodied question answering, vision and language, visual question answering.

I. INTRODUCTION

A long-standing goal of artificial intelligence is to develop agents that can perceive and interact with their environment and communicate with humans in natural language. A representative research direction studies a goal-driven agent that can communicate with humans (language), perceive the environment (vision), and explore the space (taking actions). This paper focuses on one such problem, Embodied Question Answering (EmbodiedQA) [1], a sub-field derived from Visual Question Answering (VQA), in which users ask an agent questions and, to answer them, the agent must take actions to navigate the environment and collect evidence.
A key difference from related problems, such as visual navigation [2]-[4], is that the agent is given only first-person views and has no access to a global map of the environment or to its room/object layout. The example in Fig. 1 illustrates this challenging setting, in which the agent must answer questions about an object placed at a random location in the environment.