The application
of machine learning (ML) methods to determine
the gas adsorption capacities of nanomaterials, such as metal–organic
frameworks (MOFs), has been extensively investigated over the past
few years as a computationally efficient alternative to time-consuming
and computationally demanding molecular simulations. Depending on
the thermodynamic conditions and the adsorbed gas, ML has been found
to provide very accurate results. In this work, we go one step further
and introduce chemical intuition into our descriptors by using the
“type” of the atoms in the structure, instead of the
previously used building blocks, to account for the chemical character
of the MOF. ML predictions for the methane and carbon dioxide adsorption
capacities of several tens of thousands of hypothetical MOFs are evaluated
at various thermodynamic conditions using the random forest algorithm.
For all cases examined, the use of atom types instead of building
blocks leads to significantly more accurate predictions, while the
number of MOFs needed for the training of the ML algorithm in order
to achieve a specified accuracy can be reduced by an order of magnitude.
More importantly, since there is a practically unlimited number
of building blocks that materials can be made of but only a limited number
of atom types, the proposed approach is more general and can be considered
universal. Its universality and transferability were demonstrated by predicting
the adsorption properties of a completely different family of materials
after training the ML algorithm on MOFs.
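A minimal sketch of how atom types could be turned into a fixed-length descriptor is given here. The abstract does not list the atom types or state how they are encoded, so the `ATOM_TYPES` vocabulary, the function name `atom_type_descriptor`, and the choice of a number density per atom type are all illustrative assumptions, not the authors' definition.

```python
from collections import Counter

# Hypothetical, fixed vocabulary of atom types (assumption: the source does
# not enumerate the types it uses).
ATOM_TYPES = ["C_sp2", "C_sp3", "H", "N", "O", "Zn", "Cu"]


def atom_type_descriptor(atom_types, cell_volume):
    """Return the number density of each atom type, in ATOM_TYPES order.

    atom_types  -- list of type labels, one per atom in the unit cell
    cell_volume -- unit-cell volume (any consistent unit, e.g. cubic angstrom)
    """
    if cell_volume <= 0:
        raise ValueError("cell_volume must be positive")
    counts = Counter(atom_types)
    unknown = set(counts) - set(ATOM_TYPES)
    if unknown:
        raise ValueError(f"unknown atom types: {sorted(unknown)}")
    # The vector length depends only on the vocabulary, not on the material,
    # so structures from different families map to the same feature space.
    return [counts[t] / cell_volume for t in ATOM_TYPES]


# Toy unit cell: one Zn, two O, one sp2 carbon in a 100 cubic-angstrom cell.
features = atom_type_descriptor(["Zn", "O", "O", "C_sp2"], 100.0)
```

Vectors of this kind, concatenated with structural features, could then be passed to a random forest regressor. Because the vector length is set by the atom-type vocabulary rather than by a list of building blocks, the same encoding applies to materials outside the training family.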
In the present study,
we propose a new set of descriptors that, along with a few structural
features of nanoporous materials, can be used by machine learning
algorithms for accurate predictions of the gas uptake capacities of
these materials. All new descriptors closely resemble the helium atom
void fraction of the material framework. However, instead of a helium
atom, a particle with an appropriately defined van der Waals radius
is used. The set of void fractions of a small number of these particles
is found to be sufficient to characterize uniquely the structure of
each material and to account for the most important topological features.
We assess the accuracy of our approach by examining the predictions
of the random forest algorithm on the relatively small dataset of the
computation-ready, experimental (CoRE) MOFs (∼4700 structures)
that have been experimentally synthesized and whose geometrical/structural
features have been accurately calculated before. We first performed
grand canonical Monte Carlo simulations to accurately determine their
methane uptake capacities at two different temperatures (280 and 298
K) and three different pressures (1, 5.8, and 65 bar). Despite the
high chemical and structural diversity of the CoRE MOFs, it was found
that the use of the proposed descriptors significantly improves the
accuracy of the machine learning algorithm, particularly at low pressures,
compared to predictions based solely on the remaining structural
features. More importantly, the algorithm can be easily adapted for
other types of nanoporous materials beyond MOFs. Convergence of the
predictions was reached even with training sets that are small compared
to those used in previous works based on the hypothetical MOF database.
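The idea of a void fraction measured with probes of different van der Waals radii can be sketched as follows. This is a simplified geometric version under stated assumptions: a cubic periodic cell, hard-sphere overlap instead of an energy-based insertion, and hypothetical names (`probe_void_fraction`, `void_fraction_descriptors`). The abstract does not specify the probe radii or the exact numerical scheme.

```python
import random


def _overlaps(point, atoms, box, probe_radius):
    """True if a probe centred at `point` overlaps any framework atom."""
    for x, y, z, radius in atoms:
        dist2 = 0.0
        for a, b in zip(point, (x, y, z)):
            d = abs(a - b)
            d = min(d, box - d)  # minimum-image convention, cubic cell
            dist2 += d * d
        if dist2 < (radius + probe_radius) ** 2:
            return True
    return False


def probe_void_fraction(atoms, box, probe_radius, n_samples=20000, seed=0):
    """Monte Carlo estimate of the fraction of the cell where the centre of
    a probe of the given radius fits without overlapping the framework.

    atoms -- list of (x, y, z, vdw_radius) tuples inside a cubic cell
    box   -- edge length of the cubic cell (same length unit as the atoms)
    """
    rng = random.Random(seed)
    accessible = 0
    for _ in range(n_samples):
        point = (rng.uniform(0, box), rng.uniform(0, box), rng.uniform(0, box))
        if not _overlaps(point, atoms, box, probe_radius):
            accessible += 1
    return accessible / n_samples


def void_fraction_descriptors(atoms, box, probe_radii):
    """One void fraction per probe radius; together they form the descriptor."""
    return [probe_void_fraction(atoms, box, r) for r in probe_radii]


# Toy framework: a single atom of radius 1 at the centre of a 10 x 10 x 10 cell.
toy_atoms = [(5.0, 5.0, 5.0, 1.0)]
descriptors = void_fraction_descriptors(toy_atoms, 10.0, [0.0, 1.0, 2.0])
```

Because every probe radius is evaluated on the same seeded sample points, the resulting values decrease monotonically with probe size, and how quickly they drop reflects how the pore volume is distributed over pore widths.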
In the present study, we propose a new set of descriptors, appropriate for machine learning (ML) methods, that aims to accurately predict the gas adsorption capacities of nanoporous materials. The present work focuses on systems with nonnegligible electrostatic interactions between the material's framework and the guest gas. To that end, the gases CO₂, H₂, and H₂S are examined. The present approach is a generalization of our recent development for guest gases with no electrostatic interactions, such as CH₄. For both types of systems, we use as ML descriptors the probabilities that a small number of probe atoms with different van der Waals diameters are adsorbed by the material's framework. After examining and evaluating various numerical schemes, we find that probe atoms carrying an electric dipole at their centers are the most appropriate for systems with electrostatic interactions. The accuracy of the present approach is assessed by comparing the ML predictions with a data set of reference results obtained from grand canonical Monte Carlo (GCMC) simulations. More specifically, the CO₂, H₂, and H₂S adsorption capacities of the computation-ready, experimental (CoRE) MOFs at several different thermodynamic conditions are considered. The low computational cost of calculating the proposed set of ML descriptors allows the screening of very large databases.
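To illustrate why a dipolar probe responds to framework charges, the sketch below evaluates the orientation-averaged Boltzmann weight of a point dipole in a fixed local electric field, which is sinh(x)/x with x = μE/(k_B T). This is standard statistical mechanics, but using it here is an assumption: the abstract does not state how the adsorption probabilities of the dipolar probes are computed, and the function name and the example values are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def dipole_orientation_weight(dipole_moment, field, temperature):
    """Average of exp(mu * E * cos(theta) / (k_B * T)) over all orientations.

    dipole_moment -- magnitude of the probe dipole in C*m
    field         -- magnitude of the local electric field in V/m
    temperature   -- temperature in K

    For uniformly distributed orientations the average equals sinh(x) / x,
    with x = mu * E / (k_B * T). A value of 1 means no electrostatic effect.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    x = dipole_moment * field / (K_B * temperature)
    if abs(x) < 1e-6:
        # Series expansion avoids 0/0 for a vanishing dipole or field.
        return 1.0 + x * x / 6.0
    return math.sinh(x) / x


# Illustrative values only: a 1 debye dipole in a 1e9 V/m field at 298 K.
DEBYE = 3.33564e-30  # C*m
weight = dipole_orientation_weight(1.0 * DEBYE, 1.0e9, 298.0)
```

At a given probe position this factor would multiply the Boltzmann factor of the van der Waals energy, so positions with a strong local field receive a higher weight than they would for a nonpolar probe of the same diameter.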