This paper aims at the characteristics of nonlinear, time-varying and parameter coupling in a hydraulic servo system. An intelligent control method is designed that uses self-learning without a model or prior knowledge, in order to achieve certain control effects. The control quantity can be obtained at the current moment through the continuous iteration of a strategy–value network, and the online self-tuning of parameters can be realized. Taking the hydraulic servo system as the experimental object, a twin delayed deep deterministic (TD3) policy gradient was used to reinforce the learning of the system. Additionally, the parameter setting was compared using a deep deterministic policy gradient (DDPG) and a linear–quadratic–Gaussian (LQG) based on linear quadratic Gaussian objective function. To compile the reinforcement learning algorithm and deploy it to the test platform controller for testing, we used the Speedgoat prototype target machine as the controller to build the fast prototype control test platform. MATLAB/Coder and compute unified device architecture (CUDA) were used to generate an S-function. The results show that, compared with other parameter tuning methods, the proposed algorithm can effectively optimize the controller parameters and improve the dynamic response of the system when tracking signals.