This paper proposes a deep reinforcement learning method for complete coverage path planning of an unmanned surface vehicle (USV). First, the USV and the workspace required for complete coverage are modeled. Then, a preprocessing method for raster maps is proposed that removes the blank areas of the raster map that cannot be covered. The state matrix corresponding to the preprocessed raster map is used as the input of the deep neural network, and a deep Q network (DQN) is used to train the agent's complete coverage path planning strategy. Two improvements are made to training. First, the selection of random actions is modified: for the complete coverage task, random actions are replaced with a set of actions toward the nearest uncovered grid. Second, to address the slow convergence of the deep reinforcement learning network in complete coverage path planning, the final output layer is superimposed with a dangerous actions matrix, which reduces the risk of the USV selecting dangerous actions during learning. Finally, the designed method is validated via simulation examples.
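The two training modifications described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The grid encoding, the four-action set, the Manhattan-distance notion of "nearest", the penalty value, and all function names are assumptions introduced here for illustration. Filtering the guided actions through the danger vector is also an assumption.

```python
import random

# Hypothetical encoding: 0 = free and uncovered, 1 = covered, -1 = obstacle.
ACTIONS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}  # up, down, left, right
DANGER_PENALTY = -1e9  # assumed value, large enough to dominate any Q-value


def actions_toward_nearest_uncovered(grid, pos):
    """Actions that reduce the Manhattan distance to the nearest uncovered cell."""
    r, c = pos
    targets = [(i, j) for i, row in enumerate(grid)
               for j, v in enumerate(row) if v == 0 and (i, j) != pos]
    if not targets:
        return []
    tr, tc = min(targets, key=lambda t: abs(t[0] - r) + abs(t[1] - c))
    candidates = []
    if tr < r:
        candidates.append(0)
    if tr > r:
        candidates.append(1)
    if tc < c:
        candidates.append(2)
    if tc > c:
        candidates.append(3)
    return candidates


def danger_vector(grid, pos):
    """Per-action penalty: DANGER_PENALTY if the move leaves the map or hits an obstacle."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for a in range(len(ACTIONS)):
        dr, dc = ACTIONS[a]
        nr, nc = pos[0] + dr, pos[1] + dc
        in_bounds = 0 <= nr < rows and 0 <= nc < cols
        unsafe = (not in_bounds) or grid[nr][nc] == -1
        out.append(DANGER_PENALTY if unsafe else 0.0)
    return out


def select_action(q_values, grid, pos, epsilon, rng=random):
    """Epsilon-greedy selection with guided exploration and a danger mask."""
    danger = danger_vector(grid, pos)
    if rng.random() < epsilon:
        # Exploration step: instead of a uniformly random action, pick among
        # the safe actions that head toward the nearest uncovered cell.
        guided = [a for a in actions_toward_nearest_uncovered(grid, pos)
                  if danger[a] == 0.0]
        if guided:
            return rng.choice(guided)
    # Greedy step: superimpose the danger penalties on the network output
    # so that unsafe actions cannot win the argmax.
    masked = [q + d for q, d in zip(q_values, danger)]
    return max(range(len(masked)), key=masked.__getitem__)
```

Here `q_values` stands in for the DQN's final-layer output for the current state. In this sketch, adding the penalty vector before the argmax plays the role of the superimposed dangerous actions matrix.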
With the continued development of artificial intelligence technology, unmanned surface vehicles (USVs) have attracted the attention of many domestic and international specialists and academics. In particular, path planning is a core technique for the autonomy and intelligence of USVs. Current literature reviews on USV path planning focus on the latest global and local path optimization algorithms. Almost all of these algorithms are optimized with respect to metrics such as path length, smoothness, and convergence speed. However, they are evaluated under simulated sea conditions and do not consider the effects of sea factors such as wind, waves, and currents. Therefore, this paper reviews the current algorithms and latest research results of USV path planning in terms of global path planning, local path planning, hazard avoidance with an approximate response, and path planning under clustering. By classifying USV path planning approaches, it summarizes the advantages and disadvantages of the different research methods and the entry points for improving the various algorithms. Among them, papers that use kinematic and dynamical equations to plan the ship's trajectory in actual sea environments are reviewed. For scenarios with multiple moving obstacles, the literature on multi-objective task assignment methods for path planning of USV swarms is also reviewed. The main contribution of this work is that it broadens the horizon of USV path planning and proposes future directions and research priorities based on existing technologies and trends.