This paper presents the implementation of a Deep Reinforcement Learning policy, based on DQN, that optimizes the navigation of a UAV to the front of wind turbine blades. The UAV was trained in simulation using Unreal Engine 4.27 coupled with AirSim. The action space of the UAV was discretized into 6 actions. A YOLOv5 network trained on images of simulated wind turbines was used for detection and tracking, providing the state information on which the DQN policy was trained. In addition, a dynamic reward was implemented that combines navigation and inspection objectives in the evaluation of each action. Our tests showed that after 7500 time-steps the exploration rate had decayed to near 0, the mean episode length had increased from 10 to 30, and the mean reward had increased from around -60 and stabilized at 26. These results suggest that the proposed method is a promising approach to optimizing the autonomous inspection of wind turbines with UAVs.
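As an illustration of how an exploration rate can decay to near 0 over 7500 time-steps with 6 discrete actions, the following is a minimal sketch of epsilon-greedy action selection. The linear schedule, the start and end values (`EPS_START`, `EPS_END`), and the function names are assumptions made for illustration; the abstract does not specify the schedule used or the meaning of the individual actions.

```python
import random

N_ACTIONS = 6        # six discrete actions, as stated in the abstract
DECAY_STEPS = 7500   # time-steps after which exploration is near 0 (from the abstract)
EPS_START = 1.0      # assumed initial exploration rate
EPS_END = 0.02       # assumed final ("near 0") exploration rate


def epsilon_at(step):
    """Linearly interpolate epsilon from EPS_START to EPS_END (assumed schedule)."""
    frac = min(max(step, 0) / DECAY_STEPS, 1.0)
    return EPS_START + frac * (EPS_END - EPS_START)


def select_action(q_values, step, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if len(q_values) != N_ACTIONS:
        raise ValueError("expected one Q-value per discrete action")
    if rng.random() < epsilon_at(step):
        return rng.randrange(N_ACTIONS)  # explore
    # exploit: index of the largest Q-value
    return max(range(N_ACTIONS), key=lambda a: q_values[a])
```

Early in training nearly every action is random; once the schedule has run its course, the policy almost always follows the highest Q-value estimate.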