To enhance the take-off lift of a butterfly-like flapping-wing vehicle, we built an integrated experimental platform and applied a reinforcement learning algorithm to it. The vehicle has a wingspan of 81 cm, is mounted on a stand with a force sensor, and is driven by two servos that are powered and controlled wirelessly. For learning, we used proximal policy optimization (PPO), a model-free, on-policy actor-critic algorithm. After 300 learning steps, the average aerodynamic lift force increased from 0.044 N to 0.861 N, a nearly 20-fold gain that is sufficient for the vehicle to take off without any additional aids or airflow. Analysis of the learning results revealed a strong lift peak in the upstroke. Further experiments showed that this peak is directly related to the elastic release of the wing twist and to the opening and closing of the gap between the forewing and hindwing in the early stage of the upstroke. These findings would have been difficult to predict or discover using traditional aerodynamic methods. This work provides practical reinforcement learning experience for the future development of flapping-wing vehicles.
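
For readers unfamiliar with PPO, the following minimal sketch illustrates the standard clipped surrogate objective that the actor maximizes at each update. It is a generic illustration of the published PPO formulation, not the implementation used in this work: the function name `ppo_clipped_objective` and the clip range `clip_eps=0.2` are assumptions chosen for the example, and the abstract does not state the hyperparameters or reward definition actually used.

```python
def ppo_clipped_objective(ratios, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch of samples.

    ratios:     probability ratios pi_new(a|s) / pi_old(a|s), one per sample
    advantages: advantage estimates from the critic, one per sample
    clip_eps:   clip range (0.2 is a common default, assumed here)
    """
    terms = []
    for r, a in zip(ratios, advantages):
        # Restrict the ratio to [1 - eps, 1 + eps] so that a single
        # update cannot move the policy too far from the old one.
        clipped_r = min(max(r, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the pessimistic (smaller) of the two surrogate terms.
        terms.append(min(r * a, clipped_r * a))
    return sum(terms) / len(terms)


# Hypothetical batch: the first sample's ratio exceeds the clip range.
example = ppo_clipped_objective([1.5, 1.0], [1.0, 2.0])
```

In a setup like the one described, the advantage would presumably be derived from the measured lift, so flapping motions that produce more lift than the critic expects are reinforced, while the clipping keeps each policy update conservative.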