Replacing speed transducers with software-implemented speed observers is one of the measures that increase the reliability of the control system of a Permanent Magnet Synchronous Motor (PMSM). On this basis, this paper focuses on the sensorless control of a PMSM using a Direct Torque Control (DTC) strategy, in which a speed observer is combined with a Reinforcement Learning Twin Delayed Deep Deterministic Policy Gradient (RL-TD3) agent to increase the accuracy of the PMSM rotor speed estimate. After the training phase, the RL-TD3 agent provides correction signals to the speed observer so that the estimated speed is as close as possible to the actual rotor speed. Simulated Annealing (SA) is also used for the optimal tuning of the PI-type speed controller parameters, in order to achieve superior performance of the speed observer. The paper presents the operating equations of the PMSM, the DTC-type control strategy, the speed observer, and the associated control structures and algorithms. Numerical simulations carried out in the Matlab/Simulink environment show that the rotor speed estimate obtained with the speed observer combined with an RL-TD3 agent is superior to that obtained with the speed observer alone.