In ex vivo heart perfusion (EVHP), controlling aortic pressure (AoP) is critical for maintaining the heart’s physiologic aerobic metabolism. However, the complexity and variability of cardiac parameters make rapid and accurate AoP regulation challenging. In this paper, we propose an AoP control method based on deep reinforcement learning for EVHP in Langendorff mode that adapts to variations in cardiac parameters. First, a mathematical model is developed by coupling a coronary artery model with a pulsatile blood pump model. Then, an AoP controller based on the Deep Deterministic Policy Gradient (DDPG) algorithm is designed to regulate the blood pump and close the control loop. The control performance of the proposed DDPG method is compared with that of the traditional proportional–integral–derivative (PID) method and the fuzzy PID method by simulating single and mixed changes in the mean AoP target value and coronary resistance. Under mixed changes, the proposed method outperforms the PID and fuzzy PID methods, with settling times 68.6% and 66.4% lower and overshoot values 70.3% and 54.1% lower, respectively. These simulation results indicate that the proposed DDPG-based method responds more rapidly and accurately to different cardiac conditions than the conventional PID and fuzzy PID controllers.
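The comparison above rests on two step-response metrics, settling time and overshoot, and on relative reductions of each. As a minimal sketch of how such figures could be computed, the Python below assumes a 2% tolerance band and uses a generic underdamped second-order response as a stand-in for a simulated AoP trace. The function names, the 60 to 80 mmHg step, and the damping values are all hypothetical; they are not taken from the paper, whose metric definitions may differ.

```python
import math


def step_metrics(t, y, y0, target, band=0.02):
    """Return (overshoot_percent, settling_time) for a step from y0 to target.

    Assumed definitions (not confirmed by the paper):
    - overshoot: largest excursion past the target, as a percentage of the step size
    - settling time: time after which y stays within band * |step| of the target
    """
    step = target - y0
    if step == 0:
        raise ValueError("step size must be non-zero")
    # Signed excursion beyond the target, measured in the direction of the step.
    peak = max((yi - target) / step for yi in y)
    overshoot_pct = max(0.0, peak) * 100.0
    # Find the last sample that lies outside the tolerance band.
    tol = band * abs(step)
    last_outside = -1
    for i, yi in enumerate(y):
        if abs(yi - target) > tol:
            last_outside = i
    if last_outside == len(y) - 1:
        return overshoot_pct, None  # never settled within the recorded window
    return overshoot_pct, t[last_outside + 1] - t[0]


def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return (baseline - proposed) / baseline * 100.0


def toy_response(y0, target, zeta=0.5, wn=1.0, dt=0.01, t_end=30.0):
    """Analytic step response of an underdamped second-order system (0 < zeta < 1).

    A generic placeholder for a simulated pressure trace, not the coupled
    coronary/pump model described in the abstract.
    """
    wd = wn * math.sqrt(1.0 - zeta ** 2)
    n = int(round(t_end / dt)) + 1
    t = [i * dt for i in range(n)]
    y = [
        target - (target - y0) * math.exp(-zeta * wn * ti)
        * (math.cos(wd * ti) + zeta * wn / wd * math.sin(wd * ti))
        for ti in t
    ]
    return t, y


# Hypothetical step in the mean AoP target, in mmHg (illustrative values only).
t, y = toy_response(y0=60.0, target=80.0)
overshoot, settling = step_metrics(t, y, y0=60.0, target=80.0)
print(f"overshoot = {overshoot:.1f} %, settling time = {settling:.2f} s")

# Arbitrary example numbers, not results from the paper.
print(f"relative reduction = {relative_reduction(10.0, 4.0):.1f} %")
```

Applying `step_metrics` to each controller's trace and passing the baseline and proposed values to `relative_reduction` would yield percentages of the same form as those quoted, provided the band and step definitions match the ones actually used in the study.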