This paper proposes a deep reinforcement learning algorithm for complete coverage path planning of deep-sea mining vehicle clusters. First, the mining vehicles and the deep-sea mining environment are modeled. A series of algorithm designs and optimizations is then built on the Deep Q-Network (DQN). A map fusion mechanism integrates the grid matrix data from multiple mining vehicles into a state matrix of the complete environment, and a preprocessing method for the state matrix provides suitable training data for the neural network. The reward function and the action selection mechanism are optimized for the requirements of cluster cooperative operation, and distance constraints are used to prevent entanglement of the underwater hoses. To improve training efficiency, the algorithm filters and extracts training samples according to a sample quality score. To meet the requirements of the cluster complete coverage mission, Long Short-Term Memory (LSTM) is introduced into the neural network to achieve a better training effect. Finally, the proposed algorithm is verified through simulation experiments.
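As an illustration of the map fusion mechanism and the distance constraint named above, the following minimal sketch may help. It is not the implementation used in this paper. The cell codes, the element-wise maximum as the fusion rule, the reading of the distance constraint as a minimum pairwise separation, and the names `fuse_maps` and `violates_min_distance` are all assumptions made for the example.

```python
from itertools import combinations
from math import hypot

# Hypothetical cell codes (an assumption, not taken from the paper),
# ordered so that a larger value carries more information about the cell.
UNKNOWN, FREE, COVERED, OBSTACLE = 0, 1, 2, 3


def fuse_maps(local_maps):
    """Fuse equally sized grid matrices by taking the element-wise maximum."""
    rows, cols = len(local_maps[0]), len(local_maps[0][0])
    for grid in local_maps:
        if len(grid) != rows or any(len(row) != cols for row in grid):
            raise ValueError("all local maps must have the same shape")
    return [
        [max(grid[r][c] for grid in local_maps) for c in range(cols)]
        for r in range(rows)
    ]


def violates_min_distance(positions, min_distance):
    """Return True if any two vehicles are closer than min_distance.

    Assumed form of the constraint: a minimum pairwise Euclidean
    separation between vehicles, measured in grid units.
    """
    return any(
        hypot(ax - bx, ay - by) < min_distance
        for (ax, ay), (bx, by) in combinations(positions, 2)
    )


# Two vehicles, each with a partial view of the same 2 x 3 grid.
map_a = [[COVERED, FREE, UNKNOWN], [UNKNOWN, UNKNOWN, UNKNOWN]]
map_b = [[UNKNOWN, UNKNOWN, OBSTACLE], [UNKNOWN, COVERED, FREE]]

fused = fuse_maps([map_a, map_b])
print(fused)  # [[2, 1, 3], [0, 2, 1]]
print(violates_min_distance([(0, 0), (0, 1)], min_distance=2.0))  # True
```

Under these assumptions, a candidate action whose resulting positions make `violates_min_distance` return `True` could be masked out during action selection or penalized in the reward. Which of the two the algorithm uses is not stated here.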