The utilization of UAVs in various fields has led to the development of flying ad hoc network (FANET) technology. In a network environment with highly dynamic topology and frequent link changes, the traditional routing technology of FANET cannot satisfy the new communication demands. Traditional routing algorithm, based on geographic location, can "fall" into a routing hole. In view of this problem, we propose a geolocation routing protocol based on multi-agent reinforcement learning, which decreases the packet loss rate and routing cost of the routing protocol. The protocol views each node as an intelligent agent and evaluates the value of its neighbor nodes through the local information. In the value function, nodes consider information such as link quality, residual energy and queue length, which reduces the possibility of a routing hole. The protocol uses global rewards to enable individual nodes to collaborate in transmitting data. The performance of the protocol is experimentally analyzed for UAVs under extreme conditions such as topology changes and energy constraints. Simulation results show that our proposed QLGR-S protocol has advantages in performance parameters such as throughput, end-to-end delay, and energy consumption compared with the traditional GPSR protocol. QLGR-S provides more reliable connectivity for UAV networking technology, safeguards the communication requirements between UAVs, and further promotes the development of UAV technology.