Cyber-Physical Production Systems (CPPS) provide a huge amount and variety of process and production data. Simultaneously, operational decisions are becoming ever more complex due to smaller batch sizes (down to batch size one), a larger product variety, and more complex processes in production systems. Production engineers struggle to use the recorded data to optimize production processes effectively. CPPS, however, promote decentralized decision-making by so-called intelligent agents, which are able to gather data (via sensors), process these data, possibly in combination with further information obtained by connecting to and exchanging with other agents, and finally put decisions into action (via actuators). Modular and decentralized decision-making systems are thereby able to handle far more complex systems than rigid and static architectures. This paper discusses possible applications of Machine Learning (ML) algorithms, in particular Reinforcement Learning (RL), and their potential for production planning and control aiming at operational excellence.