Machine learning thermodynamic perturbation theory (MLPT) is a promising approach for computing finite-temperature properties when the goal is to compare several levels of ab initio theory and/or to apply computationally very expensive methods. Starting from a production molecular dynamics trajectory, the method estimates properties at one or more target levels of theory from only a small number of additional fixed-geometry calculations, which are used to train a machine learning model. However, because MLPT is based on thermodynamic perturbation theory (TPT), inaccuracies may arise when the production trajectory samples a region of configurational space that overlaps only weakly with the region sampled at the target level of theory. Using case studies of molecules adsorbed in zeolites and several density functional theory approximations, we assess the accuracy of MLPT for ensemble total energies and enthalpies of adsorption. We show that problematic cases can be detected even without knowing reference results, and that even in these cases target-level results can be recovered within chemical accuracy by applying machine-learning-based Monte Carlo (MLMC) resampling. Finally, on the basis of the ideas developed in this work, we assess and confirm the accuracy of recently published MLPT-based enthalpies of adsorption at the random phase approximation level, whose high computational cost would make a direct molecular dynamics simulation prohibitive.
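
The reweighting step that TPT relies on can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the implementation used in this work: the function names, the choice of eV units, and the use of the Kish effective-sample-size fraction as an overlap indicator are all hypothetical choices made for illustration. Given per-configuration energy differences ΔE = E_target − E_production along the production trajectory, each configuration receives a weight proportional to exp(−ΔE / k_B T), and a small effective sample fraction signals that few configurations dominate the average.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def tpt_weights(delta_e, temperature):
    """Normalized weights w_i proportional to exp(-delta_e_i / (k_B * T)).

    delta_e: target-minus-production energies in eV (assumed units),
             one value per configuration of the production trajectory.
    """
    beta = 1.0 / (K_B_EV * temperature)
    exponents = [-beta * de for de in delta_e]
    shift = max(exponents)  # subtract the maximum to avoid overflow
    raw = [math.exp(x - shift) for x in exponents]
    total = sum(raw)
    return [r / total for r in raw]


def reweighted_average(observable, delta_e, temperature):
    """Estimate of the target-level ensemble average of an observable."""
    weights = tpt_weights(delta_e, temperature)
    return sum(w * a for w, a in zip(weights, observable))


def effective_sample_fraction(delta_e, temperature):
    """Kish effective sample size divided by the number of configurations.

    Values near 1 suggest good overlap between the two ensembles;
    values near 1/N mean a single configuration dominates.
    This particular indicator is an assumption of this sketch.
    """
    weights = tpt_weights(delta_e, temperature)
    return 1.0 / (len(weights) * sum(w * w for w in weights))


if __name__ == "__main__":
    # Hypothetical toy data: four configurations at 300 K.
    delta_e = [0.00, 0.01, -0.01, 0.02]      # eV
    target_energy = [-10.0, -9.9, -10.1, -9.8]  # eV
    print(reweighted_average(target_energy, delta_e, 300.0))
    print(effective_sample_fraction(delta_e, 300.0))
```

In MLPT as described above, the energy differences for most configurations would come from the trained machine learning model rather than from explicit target-level calculations; the sketch takes them as given input.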