The motion control system of a lower-limb exoskeleton rehabilitation robot (LLERR) is designed to assist patients in lower-limb rehabilitation exercises. This study designs a motion controller for an LLERR based on the Twin Delayed Deep Deterministic policy gradient (TD3) algorithm to control the exoskeleton during gait training in a staircase environment. A mathematical model of the LLERR is first established to describe its dynamics during movement. The TD3 algorithm is then used to plan the motion trajectory of the LLERR’s right-foot sole, and the target motion curves of the hip and knee joints are derived from this trajectory by inverse kinematics so that the executed motion conforms to human physiological principles. The TD3-based control strategy drives each joint of the LLERR to follow its target motion trajectory. The experimental results indicate that the trajectory tracking errors of the hip and knee joints are all within 5°, confirming that the LLERR can assist patients in completing lower-limb rehabilitation training in a staircase environment. The primary contribution of this study is a non-linear control strategy tailored to the staircase environment that enables both the planning and the control of the lower-limb joint motions assisted by the LLERR.
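The step from a planned foot trajectory to hip and knee target angles can be illustrated with a minimal planar two-link sketch. This is not the model used in this study: the segment lengths, the angle conventions (hip angle measured from the downward vertical, positive forward; knee angle as flexion), and the function names `foot_position` and `joint_angles` are all assumptions made for illustration.

```python
import math

# Hypothetical segment lengths in metres (assumed, not taken from the study).
THIGH = 0.45
SHANK = 0.43


def foot_position(hip, knee, thigh=THIGH, shank=SHANK):
    """Forward kinematics: ankle (x, y) relative to the hip joint.

    hip  : thigh angle from the downward vertical, positive forward (rad)
    knee : knee flexion angle, 0 = straight leg (rad)
    """
    shank_angle = hip - knee  # shank angle from the downward vertical
    x = thigh * math.sin(hip) + shank * math.sin(shank_angle)
    y = -thigh * math.cos(hip) - shank * math.cos(shank_angle)
    return x, y


def joint_angles(x, y, thigh=THIGH, shank=SHANK):
    """Inverse kinematics: (hip, knee) angles that place the ankle at (x, y).

    Returns the solution with knee flexion in [0, pi], i.e. the knee
    bends in the anatomically plausible direction under this convention.
    """
    d2 = x * x + y * y
    d = math.sqrt(d2)
    if d > thigh + shank + 1e-9 or d < abs(thigh - shank) - 1e-9:
        raise ValueError("target point is outside the reachable workspace")
    # Law of cosines gives the knee flexion angle.
    cos_knee = (d2 - thigh ** 2 - shank ** 2) / (2.0 * thigh * shank)
    knee = math.acos(max(-1.0, min(1.0, cos_knee)))
    # Direction of the ankle from the vertical, plus the offset of the thigh.
    hip = math.atan2(x, -y) + math.atan2(
        shank * math.sin(knee), thigh + shank * math.cos(knee)
    )
    return hip, knee
```

Applying `joint_angles` to each sampled point of a foot trajectory yields hip and knee target curves for a joint-level controller to track. Restricting the knee angle to the flexion branch is one simple way to keep the solution physiologically plausible under these assumed conventions.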