Data-driven control is currently dominated by methods that are trained offline and implemented online. However, inconsistency between the offline data and the online data may degrade control performance. To address this issue, this study develops an online control strategy in which the control parameters are updated online from real-time measured data to ensure satisfactory control performance. Specifically, an online control algorithm is proposed to control the pressing-down speed of the forging machine within the framework of reinforcement learning, which is capable of building a complete mapping from the state space to the action space using only neighbouring samples. Rather than relying on trial and error, which is too slow for online implementation, a tabu search is used to speed up the learning-working process by directly searching for the control at the current state, followed by stability conditions derived from Lyapunov stability theory. A coarse model, used only to obtain the cost information for the reinforcement learning, makes the best use of mechanism information and prevents the occurrence of invalid states that do not conform to the system characteristics. The effectiveness of the algorithm is demonstrated on an ultra-low forging machine, where it outperforms conventional approaches such as PID and neural network control.
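The idea of searching directly for the control at the current state, instead of learning it by trial and error, can be illustrated with a minimal sketch. It is not the algorithm of this study: the scalar plant in `coarse_model`, the quadratic `cost`, the Lyapunov candidate V(x) = x^2, the grid of candidate controls, and all numerical values are hypothetical and chosen only for illustration. The sketch runs a basic tabu search over a discretised control range, scores each candidate with the coarse model, and rejects candidates whose predicted state would increase V.

```python
from collections import deque


def coarse_model(x, u, dt=0.1):
    """Hypothetical coarse model (assumption): first-order speed-error dynamics."""
    return x + dt * (-0.5 * x + u)


def cost(x_next, u, r=0.01):
    """Hypothetical quadratic cost on predicted error and control effort."""
    return x_next ** 2 + r * u ** 2


def lyapunov_ok(x, x_next):
    """Assumed stability check: V(x) = x^2 must not increase."""
    return x_next ** 2 <= x ** 2


def tabu_search_control(x, k0=0, step=0.05, iters=50, tabu_len=5, k_max=20):
    """Search the control grid u = k * step, |k| <= k_max, at the current state x."""

    def evaluate(k):
        u = k * step
        x_next = coarse_model(x, u)
        if not lyapunov_ok(x, x_next):
            return float("inf")  # candidate violates the stability check
        return cost(x_next, u)

    current = k0
    best_k, best_cost = current, evaluate(current)
    tabu = deque([current], maxlen=tabu_len)  # recently visited grid indices

    for _ in range(iters):
        # Neighbours are the adjacent grid points that are in range and not tabu.
        neighbours = [k for k in (current - 1, current + 1)
                      if -k_max <= k <= k_max and k not in tabu]
        if not neighbours:
            break
        # Move to the best admissible neighbour, even if it is worse than the
        # current point; the tabu list prevents immediate backtracking.
        current = min(neighbours, key=evaluate)
        tabu.append(current)
        c = evaluate(current)
        if c < best_cost:
            best_k, best_cost = current, c

    return best_k * step, best_cost


if __name__ == "__main__":
    u, c = tabu_search_control(1.0)
    print("selected control:", u, "predicted cost:", c)
```

In this sketch the coarse model is used only to score candidate controls, which mirrors the limited role described above; in a real application the model, the cost, and the stability condition would come from the forging process and the derived Lyapunov analysis.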