“…Reinforcement learning (RL) is an artificial intelligence technique with relevant applications in robotics [8,15,28,29,30,37], path planning [20,39,47,59,75,76], and combinatorial optimization problems [4,7,13,14,21,44,53,54,64,79], such as the TSP [1,2,18,22,41,45,52,66,81]. In RL, an agent learns from rewards and penalties while interacting with an environment [68].…”