Cyclic air braking is a key factor in the safe operation of trains on long downhill sections. However, a train's cyclic braking strategy is constrained by multiple factors, such as the driving environment, speed, and air-refilling time. To address this challenge, a Q-learning-based cyclic braking strategy for a heavy haul train on long downhill sections is proposed. First, the operating environment of a heavy haul train on long downhill sections is modeled, taking into account constraint parameters such as the characteristics of the specific operating route, the allowable operating speeds, and the train pipe air-refilling time. Second, the operating states and braking actions of the train are discretized to establish a Q-table indexed by state–action pairs, and the algorithm is trained by iteratively updating this Q-table. Finally, taking a heavy haul train formation as the study object, actual line data from the Shuozhou–Huanghua Railway are used in simulation experiments under different hyperparameters and entry speeds. The results show that the strategy achieves safe and stable cyclic braking of the heavy haul train on long downhill sections, which verifies the effectiveness of the Q-learning control strategy.
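To make the discretization and Q-table update concrete, the following is a minimal sketch of tabular Q-learning on a toy downhill speed-keeping task. It is not the implementation used in this work: the speed window, the number of bins, the reward values, the one-step dynamics in `step`, and all function names are hypothetical assumptions chosen for illustration, and the air-refilling time constraint is deliberately left out to keep the example short.

```python
import random

# All constants are illustrative assumptions, not values from the paper.
SPEED_BINS = 10                # number of discrete speed states
V_MIN, V_MAX = 30.0, 80.0      # hypothetical allowable speed window, km/h
ACTIONS = (0, 1)               # 0 = release the air brake, 1 = apply it
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # learning rate, discount, exploration


def discretize(speed_kmh):
    """Map a continuous speed to a bin index in [0, SPEED_BINS - 1]."""
    ratio = (speed_kmh - V_MIN) / (V_MAX - V_MIN)
    return min(SPEED_BINS - 1, max(0, int(ratio * SPEED_BINS)))


def choose_action(q_table, state, rng):
    """Epsilon-greedy selection over the Q-table row of the current state."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    row = q_table[state]
    return max(ACTIONS, key=lambda a: row[a])


def q_update(q_table, state, action, reward, next_state, done):
    """Standard Q-learning update: Q <- Q + alpha * (target - Q)."""
    best_next = 0.0 if done else max(q_table[next_state])
    target = reward + GAMMA * best_next
    q_table[state][action] += ALPHA * (target - q_table[state][action])


def step(speed_kmh, action):
    """Toy dynamics (assumed): the gradient adds speed, braking removes it."""
    speed_kmh += -3.0 if action == 1 else 2.0
    done = speed_kmh >= V_MAX or speed_kmh <= V_MIN   # left the speed window
    reward = -10.0 if done else 1.0
    return speed_kmh, reward, done


def train(episodes=500, max_steps=200, entry_speed=55.0, seed=0):
    """Train a Q-table by repeatedly running episodes from the entry speed."""
    rng = random.Random(seed)
    q_table = [[0.0 for _ in ACTIONS] for _ in range(SPEED_BINS)]
    for _ in range(episodes):
        speed = entry_speed
        for _ in range(max_steps):
            state = discretize(speed)
            action = choose_action(q_table, state, rng)
            speed, reward, done = step(speed, action)
            q_update(q_table, state, action, reward, discretize(speed), done)
            if done:
                break
    return q_table
```

Calling `train()` returns a 10 × 2 table whose rows are speed bins and whose columns are the release and apply actions. A fuller state would presumably also include position on the line and the elapsed time since the last brake release, so that the air-refilling constraint can be enforced.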