Separating xylene isomers is vital in the petrochemical industry,
yet it poses a considerable challenge because their boiling points are
nearly identical, necessitating selective adsorbents. This work utilizes active
learning (AL) coupled with molecular simulations to rapidly screen
324,426 hypothetical metal–organic frameworks (hMOFs) to identify
optimal materials for preferential para-xylene (pX)
adsorption. To begin, a diverse subset, representative of the entire
hMOF set, was curated using structural and chemical descriptors and
evaluated through multiple screening methodologies. This comparative
analysis highlighted the superior efficiency of AL in targeted screening
processes, requiring on average only 500 multicomponent Grand Canonical
Monte Carlo simulations to identify the most pX-selective framework,
while also capturing 50.5% of the top 100 candidates. With an equivalent
evaluation budget, both machine learning (ML) and evolutionary algorithms
perform inadequately: the former consistently fails to identify top
performers, whereas the latter repeatedly identifies significantly
inferior materials. AL, on the other hand, surpasses rival approaches
by effectively balancing exploration and exploitation, guiding simulations
toward regions associated with high performance. Furthermore, we report
the impact of different surrogate models, acquisition functions, and
batch acquisition strategies on the convergence of our AL model. We
found that a Gaussian process surrogate model coupled with the expected
improvement (EI) acquisition function and the Kriging-Believer upper
bound (KBUB) batch acquisition strategy locates the most pX-selective
MOF in just 86 acquisitions. Examining the top hMOF candidates revealed
a complex correlation between the pX selectivity and structural features
of hMOFs. In particular, the pcu topology and a pore size ranging
from 5 to 6 Å emerged as the dominant characteristics of top hMOFs.
Furthermore, pressure-dependent simulations revealed an optimal
pressure that maximizes pX uptake and selectivity. This
computational workflow, integrating AL and molecular simulations,
shows promise in accelerating data-driven material innovation for
separation applications.
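
To illustrate how an expected improvement (EI) acquisition function balances exploration and exploitation, the following is a minimal sketch of the standard closed-form EI for a maximization problem. It assumes a surrogate that returns a Gaussian posterior mean and standard deviation for each candidate; the function names, the `xi` margin parameter, and the example numbers are hypothetical and are not taken from this work.

```python
import math


def expected_improvement(mu, sigma, best, xi=0.0):
    """Closed-form EI for maximization.

    mu, sigma: surrogate posterior mean and standard deviation (assumed Gaussian).
    best: highest objective value observed so far (e.g., a selectivity).
    xi: optional margin that encourages exploration (hypothetical default).
    """
    gain = mu - best - xi
    if sigma <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(gain, 0.0)
    z = gain / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return gain * cdf + sigma * pdf


def pick_next(candidates, best):
    """Return the index of the (mu, sigma) pair with the highest EI."""
    scores = [expected_improvement(m, s, best) for m, s in candidates]
    return max(range(len(scores)), key=scores.__getitem__)


# Illustrative (made-up) predictions: one near-certain candidate at the
# current best value and one slightly lower but highly uncertain candidate.
chosen = pick_next([(1.0, 0.01), (0.9, 1.0)], best=1.0)
```

In this toy case the uncertain candidate is selected, because its chance of exceeding the current best outweighs its lower predicted mean. A batch strategy such as Kriging-Believer would repeat this selection several times, updating the surrogate with predicted values between picks; that step is omitted here.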