In recent years, artificial intelligence has played an increasingly important role in the automated control of drones. After AlphaGo used reinforcement learning to defeat the world Go champion, reinforcement learning attracted widespread attention. However, most existing applications of reinforcement learning are to games with only two or three movement directions. This paper shows that, with further processing, deep reinforcement learning can be successfully applied to the classic Nokia puzzle game Snake, a game with four movement directions. Through deep reinforcement learning and training, the Snake agent (a self-learning Snake) learns to find target paths autonomously, and its average score on the Snake game exceeds the average human score. An algorithm that can find target paths autonomously has broad prospects in industry, for example in UAV inspection of oil and gas fields, or in using drones to search for and rescue injured people after a complex disaster. Post-disaster relief requires careful staffing and material dispatch, and manually planning disaster relief involves many factors. We therefore want to design a drone that can search for and rescue personnel and dispatch materials. Current drones are quite mature in terms of automated control, but they still require manual operation. The Snake algorithm proposed here, which finds target paths autonomously, is thus a first attempt at, and a key technology for, the design of drones that autonomously search for and rescue personnel and dispatch materials.

INDEX TERMS Deep reinforcement learning, Markov decision process, Monte Carlo, Q-learning.

The associate editor coordinating the review of this manuscript and approving it for publication was Zhanyu Ma.
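As a rough illustration of the four-direction path-finding setting the abstract describes, the following sketch trains a tabular Q-learning agent (one of the index terms) to reach a fixed target on a small grid using the four Snake movement directions. The grid size, reward values, and hyperparameters are assumptions for this sketch only, not values from the paper, which uses deep reinforcement learning on the full Snake game.

```python
import random

# Four movement directions, as in Snake: up, down, left, right.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]
SIZE = 5          # assumed toy grid size, not from the paper
GOAL = (4, 4)     # assumed fixed target cell

def step(state, action_idx):
    """Move within the grid; reward 1.0 at the goal, small penalty otherwise."""
    dr, dc = ACTIONS[action_idx]
    r = min(max(state[0] + dr, 0), SIZE - 1)
    c = min(max(state[1] + dc, 0), SIZE - 1)
    nxt = (r, c)
    return nxt, (1.0 if nxt == GOAL else -0.01), nxt == GOAL

def train(episodes=2000, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    random.seed(seed)
    q = {(r, c): [0.0] * 4 for r in range(SIZE) for c in range(SIZE)}
    for _ in range(episodes):
        s = (0, 0)
        for _ in range(100):  # cap episode length
            a = (random.randrange(4) if random.random() < eps
                 else max(range(4), key=lambda i: q[s][i]))
            s2, reward, done = step(s, a)
            # Q-learning update: move Q(s,a) toward reward + gamma * max_a' Q(s',a')
            q[s][a] += alpha * (reward + gamma * max(q[s2]) - q[s][a])
            s = s2
            if done:
                break
    return q

def greedy_path(q, start=(0, 0), limit=50):
    """Follow the learned greedy policy from start toward the goal."""
    s, path = start, [start]
    for _ in range(limit):
        if s == GOAL:
            break
        s, _, _ = step(s, max(range(4), key=lambda i: q[s][i]))
        path.append(s)
    return path
```

After training, following the greedy policy from (0, 0) should reach the target; the shortest route on this grid is 8 moves. The paper's deep variant replaces the Q-table with a neural network over game-screen states, but the update rule is the same in spirit.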