Reducing energy consumption under processor temperature constraints has recently become a pressing issue in real‐time multiprocessor systems on chips (MPSoCs). High processor temperatures adversely affect the power consumption and reliability of an MPSoC, and low energy consumption is essential for real‐time embedded systems, as most of them are portable devices. Efficient mapping of tasks to processors has a significant impact on both energy consumption and the thermal profile of the processors, and several state‐of‐the‐art techniques have recently been proposed to address this issue. This paper proposes Q‐scheduler, a novel technique based on deep Q‐learning, to dispatch tasks among the processors of a real‐time MPSoC. Q‐scheduler is first trained offline on thousands of simulated tasks to reduce the system's power consumption under the temperature constraints of the processors. The trained Q‐scheduler then dispatches real tasks online in a real‐time MPSoC, while continuing to be trained regularly online. Q‐scheduler dispatches multiple tasks in the system simultaneously with a single process; this ability is particularly effective in harmonic real‐time systems. Experimental results show that Q‐scheduler reduces the energy consumption and temperature of processors by 15% and 10% on average, respectively, compared to previous state‐of‐the‐art techniques.
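To make the idea of learning a dispatch policy from simulated tasks concrete, the following is a minimal sketch and not the Q‐scheduler implementation. It swaps the deep Q‐network for a plain tabular Q‐learning update, and it assumes a toy two‐core thermal and energy model. Every constant and helper name here (`POWER`, `T_MAX`, `observe`, `step`, `choose`, `train`) is hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict

# All values below are illustrative assumptions, not taken from the paper.
N_PROCS = 2                      # hypothetical number of cores
T_AMBIENT, T_MAX = 40.0, 80.0    # hypothetical ambient and maximum temperature (deg C)
POWER = [1.0, 1.5]               # hypothetical energy per unit of work, per core
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration rate


def observe(temps):
    """Discretize core temperatures into 10-degree bins to form a state."""
    return tuple(int(t // 10) for t in temps)


def step(temps, core, work):
    """Toy model: every core cools toward ambient, the chosen core heats up."""
    energy = POWER[core] * work
    new_temps = []
    for i, t in enumerate(temps):
        t = T_AMBIENT + 0.8 * (t - T_AMBIENT)  # cooling toward ambient
        if i == core:
            t += 8.0 * energy                  # heating from running the task
        new_temps.append(t)
    # Reward: lower energy is better; exceeding T_MAX adds a fixed penalty.
    reward = -energy - (10.0 if max(new_temps) > T_MAX else 0.0)
    return new_temps, reward


def choose(q, state, epsilon, rng):
    """Epsilon-greedy choice of the core that receives the next task."""
    if rng.random() < epsilon:
        return rng.randrange(N_PROCS)
    return max(range(N_PROCS), key=lambda a: q[(state, a)])


def train(n_tasks=5000, seed=0):
    """Offline phase: learn Q-values from a stream of simulated tasks."""
    rng = random.Random(seed)
    q = defaultdict(float)
    temps = [T_AMBIENT] * N_PROCS
    for _ in range(n_tasks):
        work = rng.uniform(0.5, 1.5)           # simulated task size
        s = observe(temps)
        a = choose(q, s, EPSILON, rng)
        temps, r = step(temps, a, work)
        s2 = observe(temps)
        best_next = max(q[(s2, b)] for b in range(N_PROCS))
        # Standard Q-learning update toward the bootstrapped target.
        q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])
    return q
```

In this sketch, an online phase would call `choose(q, observe(temps), 0.0, rng)` for each arriving task and could keep applying the same update, mirroring the offline‐then‐online training described above. A deep Q‐learning variant would replace the table `q` with a neural network that estimates the same action values.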