In recent research on 3D underwater wireless sensor networks (UWSNs), magnetic induction communication has emerged as a promising candidate thanks to several unique features, such as low transmission delay, constant channel behavior, and an adequately long communication range. However, designing a routing protocol that both prolongs the network lifetime and reduces the transmission delay remains a challenge for 3D UWSNs. In this paper, we propose an efficient routing protocol based on reinforcement learning, specifically Q-learning, which we use to investigate resource management in hierarchical networks. By defining single-hop bonus metrics for distance and energy, we derive the update formula of the routing algorithm and the relationship between energy priority and distance priority. In addition, we introduce a regulatory factor that adjusts the balance between energy saving and low delay, so that the protocol can meet different requirements. The simulation results show that the proposed routing approach outperforms the conventional protocol in extending the network lifetime and reducing the transmission delay.

INDEX TERMS Underwater wireless sensor network, magnetic induction, routing protocol, reinforcement learning.