To improve power performance and fuel economy while reducing the online computation time of hybrid electric vehicles (HEVs) in off-road driving environments, a real-time energy management strategy based on the expected state-action-reward-state’-action’ (SARSA) algorithm is proposed. The strategy coordinates control of the engine generator set (EGS) with the distribution of energy between the battery and the EGS. The driving environment is modeled as a transition probability matrix (TPM) of the electric power demand, and the optimal control problem of energy management is solved offline by the expected SARSA algorithm to obtain the control laws. These control laws are then applied online for real-time control, with the Kullback-Leibler (KL) divergence rate serving as the evaluation index for switching the control strategy. Control strategies based on traditional reinforcement learning (RL), stochastic dynamic programming (SDP), and dynamic programming (DP) serve as benchmarks to verify the effectiveness of the proposed strategy. Compared with the benchmark algorithms, fuel economy is improved by up to 13.2% and the elapsed computation time is reduced by more than 99.8%. These results show that the proposed strategy adapts flexibly to complex and changeable off-road conditions without prior knowledge of the whole driving cycle.
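The three building blocks named above (a TPM of the power demand, a KL-based comparison of TPMs, and the expected SARSA update) can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the function names, the smoothing constant `eps`, the uniform default `weights`, the epsilon-greedy policy, and the values of `alpha`, `gamma`, and `epsilon` are all assumptions chosen for illustration, and the state, action, and reward definitions are left abstract.

```python
import math


def estimate_tpm(levels, n_states, eps=1e-6):
    """Estimate a row-stochastic TPM from a sequence of discretized
    power-demand levels (integers in 0..n_states-1).

    eps is an assumed smoothing constant that keeps every entry positive.
    """
    counts = [[eps] * n_states for _ in range(n_states)]
    for cur, nxt in zip(levels, levels[1:]):
        counts[cur][nxt] += 1.0
    return [[c / sum(row) for c in row] for row in counts]


def kl_divergence_rate(p, q, weights=None):
    """Weighted sum of row-wise KL divergences between two TPMs p and q.

    Assumption: rows are weighted uniformly unless weights are supplied;
    the stationary distribution of p is another common choice.
    Entries of q must be positive (estimate_tpm guarantees this).
    """
    n = len(p)
    if weights is None:
        weights = [1.0 / n] * n
    total = 0.0
    for i in range(n):
        row_kl = sum(
            p[i][j] * math.log(p[i][j] / q[i][j])
            for j in range(n)
            if p[i][j] > 0.0
        )
        total += weights[i] * row_kl
    return total


def expected_sarsa_update(q_table, s, a, r, s_next,
                          alpha=0.1, gamma=0.95, epsilon=0.1):
    """Apply one tabular expected SARSA update in place and return Q(s, a).

    The target uses the expectation of Q(s_next, .) under an assumed
    epsilon-greedy policy instead of a single sampled next action.
    """
    next_values = q_table[s_next]
    n_actions = len(next_values)
    greedy = max(range(n_actions), key=lambda k: next_values[k])
    # Epsilon-greedy action probabilities in the next state.
    probs = [epsilon / n_actions] * n_actions
    probs[greedy] += 1.0 - epsilon
    expected_q = sum(pk * qk for pk, qk in zip(probs, next_values))
    q_table[s][a] += alpha * (r + gamma * expected_q - q_table[s][a])
    return q_table[s][a]
```

In such a sketch, a TPM estimated from recent power-demand data could be compared with the TPM behind the current control law, and a divergence above some chosen threshold would indicate that a different control law should be used; the threshold itself is not specified here.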