Generating the pitch control signal of a wind turbine (WT) is not straightforward because of the nonlinear dynamics of the system and the coupling of its internal variables; in addition, these variables are subject to the uncertainty introduced by the random nature of the wind. Fuzzy logic has proved useful in applications where system parameters change or where uncertainty is relevant, as in this case, but tuning the parameters of a fuzzy logic controller (FLC) is not an easy task. Reinforcement learning (RL), in turn, allows systems to learn automatically, and this capability can be exploited to tune the FLC. This work proposes a WT pitch control architecture that uses RL to tune the membership functions and scale the output of a fuzzy controller. The RL strategy calculates the fuzzy controller gains so as to reduce the output power error of the WT according to the wind speed. Different reward mechanisms based on the output power error have been considered. Simulation results with different wind profiles show that this architecture yields a smaller power error (123.7 W) than an FLC without RL (133.2 W) or a simpler PID controller (208.8 W). Moreover, it provides a smooth response and outperforms other hybrid controllers such as RL-PID and radial basis function neural network control.
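
The idea of letting an RL agent choose the scaling of a fuzzy controller per wind-speed range can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from this work: the three-set rule base, the names `fuzzy_pitch` and `GainTuner`, the discrete set of candidate gains, the wind-speed binning, and the reward, here simply the negative absolute power error. The value update is a one-step, bandit-style simplification and is not necessarily the RL algorithm used in the proposed architecture.

```python
import numpy as np


def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    return max(0.0, min((x - a) / (b - a), (c - x) / (c - b)))


def fuzzy_pitch(error, width=1.0, gain=1.0):
    """Map a power error to a pitch command with a hypothetical 3-rule FLC.

    `width` stretches the membership functions (negative, zero, positive);
    `gain` scales the defuzzified output. Both are the tunable parameters.
    """
    e = float(np.clip(error / width, -1.0, 1.0))
    mu = np.array([
        tri(e, -2.0, -1.0, 0.0),   # error is negative
        tri(e, -1.0, 0.0, 1.0),    # error is near zero
        tri(e, 0.0, 1.0, 2.0),     # error is positive
    ])
    centers = np.array([-1.0, 0.0, 1.0])  # singleton consequents (assumed)
    # Weighted-average defuzzification, then output scaling.
    return gain * float(mu @ centers / mu.sum())


class GainTuner:
    """Epsilon-greedy selection of an output gain per wind-speed bin."""

    def __init__(self, gains, n_wind_bins, alpha=0.1, epsilon=0.1, seed=0):
        self.gains = np.asarray(gains, dtype=float)
        self.q = np.zeros((n_wind_bins, len(self.gains)))
        self.alpha = alpha
        self.epsilon = epsilon
        self.rng = np.random.default_rng(seed)

    def select(self, wind_bin):
        """Return the index of the gain to apply in this wind-speed bin."""
        if self.rng.random() < self.epsilon:
            return int(self.rng.integers(len(self.gains)))  # explore
        return int(np.argmax(self.q[wind_bin]))              # exploit

    def update(self, wind_bin, action, power_error):
        """Move the value estimate toward the reward -|power error|."""
        reward = -abs(power_error)
        self.q[wind_bin, action] += self.alpha * (reward - self.q[wind_bin, action])
```

In a control loop, one would call `select` for the current wind-speed bin, apply `fuzzy_pitch(error, gain=tuner.gains[action])`, measure the resulting power error, and pass it to `update`. Tuning `width` in the same way would only require a second set of candidate values.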