“…Because RL algorithms learn by trial and error through interaction with the environment, without requiring pre-collected data or prior expert knowledge, they can adapt to uncertain conditions, as also discussed by Panzer and Bender (2022). Applications in manufacturing include scheduling (Dong, Xue, Xiao and Li, 2020), maintenance (Rodríguez, Kubler, de Giorgio, Cordy, Robert and Le Traon, 2022; Yousefi, Tsianikas and Coit, 2022), process control (Spielberg, Tulsyan, Lawrence, Loewen and Gopaluni, 2020), energy management (Lu, Li, Li, Jiang and Ding, 2020), assembly (Tortorelli, Imran, Delli Priscoli and Liberati, 2022), and robot manipulation (Beltran-Hernandez, Petit, Ramirez-Alpizar and Harada, 2020; Schoettler, Nair, Luo, Bahl, Ojea, Solowjow and Levine, 2020).…”