In recent years, industries have increasingly demanded high-speed, energy-efficient, and cost-effective solutions, driving growing interest in flexible-link manipulator robots. However, reducing the weight of a manipulator increases its flexibility, which in turn induces vibration. This paper introduces a novel approach for controlling the vibration and motion of a two-link flexible manipulator using reinforcement learning. The proposed system uses trust region policy optimization (TRPO) to train the manipulator's end effector to reach a desired target position while minimizing vibration and strain at the root of the link. To achieve the research objectives, a 3D model of the flexible-link manipulator is designed, and an optimal reward function is identified to guide the learning process. The results demonstrate that the proposed approach successfully suppresses vibration and strain while moving the end effector to the target position. Furthermore, the trained model is applied to a physical flexible manipulator for real-world control verification. However, the performance of the trained model falls short of expectations because of simulation-to-real transfer challenges, which may include unanticipated differences in dynamics, calibration issues, actuator limitations, or other factors that alter the system's behavior in the real world. Further investigation and improvements are therefore recommended to bridge this gap and enhance the applicability of the proposed approach.
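To make the reward-shaping idea concrete, the sketch below shows one possible way to combine target tracking with penalties on root strain and tip vibration, as the abstract describes. It is a minimal illustration under assumed state variables (tip position, root strain, tip velocity as a vibration proxy) and illustrative weights; the reward function actually identified in the paper may differ.

```python
import numpy as np

# Hypothetical reward shaping for the two-link flexible manipulator task:
# drive the end effector toward the target while penalizing strain at the
# link roots and tip vibration. Weights are illustrative, not from the paper.
W_STRAIN = 0.5      # weight on root-strain penalty (assumed)
W_VIBRATION = 0.2   # weight on tip-vibration penalty (assumed)

def reward(tip_pos, target_pos, root_strain, tip_velocity):
    """Return a scalar reward for one control step.

    tip_pos, target_pos : 3D end-effector and target positions
    root_strain         : strain measured at the root of each flexible link
    tip_velocity        : end-effector velocity, used as a vibration proxy
    """
    tracking_error = np.linalg.norm(np.asarray(tip_pos) - np.asarray(target_pos))
    strain_penalty = W_STRAIN * np.sum(np.abs(root_strain))
    vibration_penalty = W_VIBRATION * np.linalg.norm(tip_velocity)
    return -(tracking_error + strain_penalty + vibration_penalty)

# Example: reward for a state near the target with small strain and vibration.
print(reward(tip_pos=[0.48, 0.31, 0.10],
             target_pos=[0.50, 0.30, 0.10],
             root_strain=[1e-4, 2e-4],
             tip_velocity=[0.01, 0.00, 0.02]))
```

In a TRPO training loop, a reward of this form would be returned at every simulation step, so the policy is rewarded both for reaching the target and for keeping the flexible links quiescent along the way.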