Fuel cells are highly sensitive to temperature. To improve their temperature stability, this article proposes a reinforcement learning control method that regulates the speed of the cooling water pump, and designs a waste heat recovery structure built on this control method. First, the model's accuracy is validated against experimental data. Simulation results then show that reinforcement learning control improves average temperature regulation capability by 5.6% compared to proportional‐integral‐derivative (PID) control. With waste heat recovery incorporated, the energy consumption of the pump is reduced by 0.04 kWh compared to PID control, and the energy consumption of the air‐conditioning compressor is reduced by 0.63 kWh. For the same driving distance, hydrogen consumption decreases by 80 g. Additionally, the coefficient of performance increases by 1.5% in the waste heat recovery mode.