In wireless sensor networks, optimizing the network lifetime is an important issue. Most existing works define network lifetime as the time at which the first sensor node exhausts all of its energy. However, this moment is not necessarily the most meaningful one, because the network as a whole is likely to keep working properly after a single sensor node dies. In this article, we first consider the demands of applications as a whole and define the network lifetime in three aspects. Then, we construct a performance evaluation framework for routing protocols. To optimize the network lifetime in all of the defined aspects, we propose a reinforcement-learning-based routing protocol. The protocol uses reinforcement learning to search for the optimal routing path for data transmission. Its reward function takes into account link distance, residual energy, and hop count to the sink in order to reduce the total energy consumption, balance the energy consumption, and improve packet delivery. Simulation results demonstrate that, compared with energy-aware routing, BEER, Q-Routing, and MRL-SCSO, the proposed protocol optimizes the network lifetime in all three aspects and improves energy efficiency.
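
To make the role of the reward function concrete, the following is a minimal sketch of how link distance, residual energy, and hop count to the sink could be combined into a single reward and fed into a standard tabular Q-learning update for next-hop selection. It is not the protocol's actual formulation: the weights `W_DIST`, `W_ENERGY`, `W_HOP`, the learning parameters `ALPHA` and `GAMMA`, the normalization of each term, and all function names are assumptions introduced here for illustration only.

```python
# Hypothetical weights and learning parameters (not taken from the article).
W_DIST, W_ENERGY, W_HOP = 0.4, 0.4, 0.2
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor


def reward(link_distance, max_distance, residual_energy, initial_energy,
           hops_next, hops_current):
    """Illustrative reward for forwarding a packet to a candidate next hop.

    Higher when the link is short, the next hop has more residual energy,
    and the next hop is closer (in hops) to the sink than the current node.
    """
    dist_term = 1.0 - min(link_distance / max_distance, 1.0)
    energy_term = residual_energy / initial_energy
    hop_term = 1.0 if hops_next < hops_current else 0.0
    return W_DIST * dist_term + W_ENERGY * energy_term + W_HOP * hop_term


def q_update(q, node, next_hop, r, neighbors_of_next):
    """Standard Q-learning update for the action 'node forwards to next_hop'.

    q maps (node, neighbor) pairs to Q-values; missing entries count as 0.
    """
    best_next = max((q.get((next_hop, n), 0.0) for n in neighbors_of_next),
                    default=0.0)
    old = q.get((node, next_hop), 0.0)
    q[(node, next_hop)] = old + ALPHA * (r + GAMMA * best_next - old)
    return q[(node, next_hop)]


def choose_next_hop(q, node, neighbors):
    """Greedy selection: pick the neighbor with the highest Q-value."""
    return max(neighbors, key=lambda n: q.get((node, n), 0.0))
```

In this sketch the three weighted terms correspond to the three stated goals: the distance term discourages long, costly links, the energy term steers traffic away from depleted nodes, and the hop term favors progress toward the sink. A real protocol would also need an exploration strategy and a way to exchange Q-values between neighbors, which are omitted here.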