The massive volume of data generated by large-scale dynamic systems makes their optimization a serious challenge. Traditional White Box-based methods directly model the internal operating mechanism of the system, so they must process massive amounts of measured data, which is costly and time-consuming. Black Box-based methods suffer from poor interpretability, which makes it difficult for them to adapt to dynamic environments. We therefore propose a novel Gray Box-based approach, the Deep Reinforcement Learning-enabled Constraint Set Inversion Algorithm (DRESIA), which establishes a quantitative model of the nonlinear interoperability effects among the system's internal states. This model simplifies the White Box's complex mechanism of reconstruction and prediction while retaining the interpretability of the model, and thereby improves both the efficiency of feasible-region prediction and the generalization ability. It further improves dynamic adaptability to the modeling environment, providing a new performance-balancing scheme for system modeling. Under the premise that a large-scale 5G Cyber-Twin system satisfies the given Quality of Service (QoS) requirements, we apply DRESIA to realize an efficient and dynamic optimal search of the feasible region. The results show that DRESIA reduces the computational cost and balances the accuracy and robustness of the feasible region, which validates the effectiveness and superiority of the Gray Box-based approach.