“…There have already been some initial attempts to explore reinforcement learning (RL) for restricted tasks in scheduling, routing, and network optimisation [23,24,25,26,27,28,29,30,31,32]. Our approach differs from these in that it offers a practical application of RL in a real-world online environment. In this application, RL will adapt not only to the broad properties of the problem but also to the individual properties of the equipment used.…”