Flash calculations are essential in reservoir engineering applications, most notably in compositional flow simulation and separation processes, as they provide the phase distribution factors, known as k-values, at a given pressure and temperature. The calculation output is subsequently used to estimate composition-dependent properties of interest, such as the equilibrium phases’ molar fractions, compositions, densities, and compressibilities. However, when the flash conditions approach criticality, minor inaccuracies in the computed k-values may lead to significant deviations in the dependent properties, which are eventually propagated to the simulator, resulting in large simulation errors. Although several machine-learning-based regression approaches have emerged that drastically accelerate flash calculations, the criticality issue persists. To address this problem, a novel resampling technique for the ML models’ training data population is proposed, which fine-tunes the training dataset distribution so as to optimally exploit the models’ learning capacity across the full range of flash conditions. The results demonstrate significantly improved accuracy in predicting phase behavior near criticality, offering valuable contributions not only to the subsurface reservoir engineering industry but also to the broader field of thermodynamics. By understanding and optimizing the models’ training, this research enables more precise predictions and better-informed decision-making in domains involving phase separation phenomena. The proposed technique is applicable to any ML-driven regression problem in which properties dependent on the model output, rather than the output itself, are of interest.
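To make the resampling idea concrete, the following is a minimal illustrative sketch, not the paper's exact method: it rebalances a flash-calculation training set by oversampling samples that lie close to criticality, assuming each sample carries a hypothetical scalar metric `d` measuring its distance from the critical point (the function name, binning scheme, and metric are all assumptions introduced here for illustration).

```python
# Hypothetical sketch of distribution-aware resampling of training data.
# Assumption: `d` is some scalar proximity-to-criticality metric per sample;
# underrepresented regions of `d` (typically near-critical conditions) are
# oversampled so every region contributes equally to training.
import numpy as np

def resample_by_criticality(X, y, d, n_bins=10, rng=None):
    """Oversample (with replacement) so each criticality bin is equally populated."""
    rng = np.random.default_rng(rng)
    # Partition the criticality range into n_bins equal-width bins.
    edges = np.linspace(d.min(), d.max(), n_bins + 1)[1:-1]
    bins = np.digitize(d, edges)  # bin index 0 .. n_bins-1 per sample
    # Target size: the largest bin's population.
    target = int(np.bincount(bins, minlength=n_bins).max())
    idx = []
    for b in range(n_bins):
        members = np.flatnonzero(bins == b)
        if members.size:  # skip empty bins
            idx.append(rng.choice(members, size=target, replace=True))
    idx = np.concatenate(idx)
    return X[idx], y[idx]
```

A model trained on the resampled `(X, y)` pairs then sees near-critical conditions as often as far-from-critical ones, instead of inheriting whatever imbalance the raw sampling of pressure-temperature-composition space happened to produce.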