Yaw angle control of a wind turbine maximizes the power captured from the wind and thus increases system efficiency. Conventionally, classical control algorithms have been used for the yaw angle control of wind turbines. In recent years, however, advanced control strategies have been designed and implemented for this purpose, and they are considered to offer improved performance over classical algorithms. In this paper, an advanced yaw control strategy based on reinforcement learning (RL) is designed and verified in a simulation environment. The proposed RL algorithm considers multivariable states and actions, as well as the mechanical loads due to the yaw rotation of the wind turbine nacelle and rotor. Furthermore, an algorithm based on particle swarm optimization (PSO) and the Pareto optimal front (PoF) has been developed to find the optimal actions that satisfy the compromise between the power gain and the mechanical loads due to the yaw rotation. The objectives of the proposed RL algorithm are to maximize power generation and to minimize the mechanical loads in the yaw bearings automatically. The data of the RL matrices Q(s,a) are stored as continuous functions in an artificial neural network (ANN), which avoids any quantization problem. The NREL 5-MW reference wind turbine has been considered for the analysis, and real wind data from Salt Lake, Utah, have been used to validate the designed yaw control strategy via simulations with the aeroelastic code FAST.
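The compromise between power gain and mechanical load can be illustrated with a minimal Pareto-front filter. The sketch below is only an illustration under stated assumptions, not the algorithm developed in this paper: the function name `pareto_front`, the (power gain, load) tuple layout, and all numeric values are hypothetical, and the PSO search that would generate the candidate actions is omitted. A candidate is kept when no other candidate offers at least the same power gain at no greater load while being strictly better in one of the two.

```python
from typing import List, Tuple


def pareto_front(candidates: List[Tuple[float, float]]) -> List[int]:
    """Return indices of non-dominated (power_gain, load) pairs.

    Power gain is to be maximized and mechanical load minimized.
    """
    front = []
    for i, (gain_i, load_i) in enumerate(candidates):
        dominated = False
        for j, (gain_j, load_j) in enumerate(candidates):
            if j == i:
                continue
            # j dominates i: no worse in both objectives, strictly better in one
            no_worse = gain_j >= gain_i and load_j <= load_i
            strictly_better = gain_j > gain_i or load_j < load_i
            if no_worse and strictly_better:
                dominated = True
                break
        if not dominated:
            front.append(i)
    return front


# Hypothetical candidate yaw actions as (power gain, yaw-bearing load) pairs;
# the values are made up for illustration and carry no physical meaning.
candidates = [(0.10, 5.0), (0.30, 9.0), (0.25, 12.0), (0.05, 4.0), (0.30, 15.0)]
front = pareto_front(candidates)
```

With these made-up values, the third and fifth candidates are discarded because the second one yields at least as much power at a lower load. The remaining candidates form the front from which a single compromise action would still have to be selected.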