Path planning is critical for planetary rovers that perform observation and exploration missions in unknown and dangerous environments. Because of the communication delay, a planetary rover often cannot receive instructions from Earth in time to guide its own movement. In this work, we present a novel neural network-based algorithm to solve the global path planning problem for planetary rovers. Inspired by feature pyramid networks used for object detection, we construct a deep neural network model, termed the Pyramid Path Planning Network (P3N). It has a backbone that efficiently learns a global feature representation of the environment, and a feature pyramid branch that adaptively fuses multi-scale features from different levels to generate a local feature representation with rich semantic information. The P3N learns environmental dynamics from satellite terrain images of the planetary surface, without using additional elevation information to construct an explicit environmental model in advance, and can execute a path planning policy after end-to-end training. We evaluate the proposed method on synthetic grid maps and on a realistic data set constructed from lunar terrain images. Experimental results demonstrate that our P3N achieves higher prediction accuracy and faster computation than the baseline methods, and generalizes better in large-scale environments.

INDEX TERMS Deep learning, feature pyramid network, global path planning, multi-scale feature fusion, planetary rover.
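To make the idea of multi-scale feature fusion concrete, the following is a minimal sketch of a top-down pyramid merge on plain 2-D lists. It is not the P3N architecture: the abstract does not specify the fusion rule, so the nearest-neighbour upsampling, the per-level scalar `weights` (standing in for learned lateral layers), and the function names `upsample2x` and `fuse_top_down` are all illustrative assumptions.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def fuse_top_down(levels, weights=None):
    """Fuse a feature pyramid from coarse to fine.

    levels  -- feature maps ordered fine to coarse; each level is assumed
               to have half the height and width of the previous one.
    weights -- hypothetical per-level scalars applied to each level's own
               features (an assumption; a real network would learn layers).
    Returns the fused maps in the same fine-to-coarse order.
    """
    if weights is None:
        weights = [1.0] * len(levels)
    fused = [None] * len(levels)
    # The coarsest level has nothing above it, so it is only rescaled.
    fused[-1] = [[weights[-1] * v for v in row] for row in levels[-1]]
    # Walk towards finer levels, adding the upsampled coarser result.
    for i in range(len(levels) - 2, -1, -1):
        up = upsample2x(fused[i + 1])
        fused[i] = [
            [weights[i] * a + b for a, b in zip(fine_row, up_row)]
            for fine_row, up_row in zip(levels[i], up)
        ]
    return fused


if __name__ == "__main__":
    pyramid = [
        [[1.0] * 4 for _ in range(4)],   # fine: 4x4
        [[10.0, 20.0], [30.0, 40.0]],    # middle: 2x2
        [[100.0]],                       # coarse: 1x1
    ]
    for level in fuse_top_down(pyramid):
        print(level)
```

Each fused fine-level cell then carries its own local value plus context from every coarser level that covers it, which is the general intuition behind combining global and local representations.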
Path planning is essential for planetary rovers that perform exploration missions in unfamiliar environments. In this work, we propose a novel global path planning algorithm based on the value iteration network (VIN). The VIN embeds a differentiable planning module built on the value iteration (VI) algorithm and has emerged as an effective method for learning to plan. Although it can learn environment dynamics and perform long-range reasoning, the VIN suffers from several limitations, including sensitivity to initialization and poor performance in large-scale domains. We introduce the double value iteration network (dVIN), which decouples action selection from value estimation in the VI module: it uses the weighted double estimator method to approximate the maximum expected value instead of maximizing over the estimated action values. We also devise a simple yet effective two-stage training strategy for VI-based models to address the high computational cost and poor performance in large domains. We evaluate the dVIN on planning problems in grid-world domains and on realistic datasets generated from terrain images of a lunar landscape. We show that our dVIN empirically outperforms the baseline methods and generalizes better to large-scale environments.
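The difference between maximizing over estimated action values and using a weighted double estimator can be sketched as follows. This is an assumption-laden illustration, not the dVIN's VI module: the abstract does not give the formula, so the weighting below follows one form used in the weighted double Q-learning literature, and the names `max_estimate`, `weighted_double_estimate`, and the constant `c` are hypothetical.

```python
def max_estimate(q):
    """Single estimator: take the maximum of the estimated action values."""
    return max(q)


def weighted_double_estimate(q_a, q_b, c=1.0):
    """Weighted double estimate of the maximum expected value.

    q_a, q_b -- two independent estimates of the same action values.
    c        -- positive constant controlling the weighting (assumed here).

    The action is selected with q_a, and its value is a blend of q_a and
    q_b, so selection and evaluation are partly decoupled.
    """
    actions = range(len(q_a))
    best = max(actions, key=lambda a: q_a[a])    # action chosen by q_a
    worst = min(actions, key=lambda a: q_a[a])   # lowest-valued action in q_a
    gap = abs(q_b[best] - q_b[worst])            # spread as judged by q_b
    beta = gap / (c + gap)                       # weight in [0, 1)
    return beta * q_a[best] + (1.0 - beta) * q_b[best]


if __name__ == "__main__":
    q_a = [1.0, 3.0, 2.0]
    q_b = [0.5, 2.0, 2.5]
    print(max_estimate(q_a))
    print(weighted_double_estimate(q_a, q_b, c=1.0))
```

With a large `c` the weight `beta` approaches 0 and the result approaches the plain double estimator `q_b[best]`. With a small `c` it approaches the single-estimator value `q_a[best]`. The blended value always lies between the two.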