In today’s rapidly changing production landscape, with increasingly complex manufacturing processes and shortening product life cycles, a company’s competitiveness depends on its ability to design flexible and resilient production processes. On the shop floor in particular, production control plays a crucial role in coping with disruptions and maintaining system stability and resilience. To address challenges arising from volatile sales markets and other factors, deep learning algorithms have been increasingly applied in production to facilitate fast-paced operations. In particular, deep reinforcement learning has frequently surpassed conventional and intelligent approaches in terms of performance and computational efficiency and has shown high levels of control adaptability. However, existing approaches are often limited in scope and scenario-specific, which hinders a seamless transfer to other control optimization problems. In this paper, we propose a flexible framework that integrates a deep-learning-based hyper-heuristic into modular production to optimize pre-defined performance indicators. The framework employs module recognition and agent experience sharing, enabling a fast initiation of multi-level production systems as well as resilient control strategies. To minimize computational and re-training effort, a stack of trained policies is used to enable efficient reuse of previously trained agents. Benchmark results show that our approach outperforms conventional rules in terms of multi-objective optimization. The simulation framework further encourages research on deep-learning-based control approaches to leverage explainability.