Activity coefficients, which are a measure of the non-ideality of liquid mixtures, are a key property in chemical engineering with relevance to modeling chemical and phase equilibria as well as transport processes. Although experimental data on thousands of binary mixtures are available, prediction methods are needed to calculate the activity coefficients in many relevant mixtures that have not been explored to date. In this report, we propose a probabilistic matrix factorization model for predicting the activity coefficients in arbitrary binary mixtures. Although no physical descriptors for the considered components were used, our method outperforms the state-of-the-art method, which has been refined over three decades, while requiring much less training effort. This opens perspectives for novel methods for predicting physico-chemical properties of binary mixtures, with the potential to revolutionize modeling and simulation in chemical engineering.
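To make the idea of completing a sparse solute-solvent matrix by factorization concrete, the following is a minimal sketch under strong simplifying assumptions. It is not the probabilistic model proposed in this report: it uses a deterministic rank-one alternating least squares fit, and the numbers are synthetic stand-ins rather than measured activity coefficients. The function name `complete_rank_one`, the parameter `ridge`, and the toy data are hypothetical and chosen for illustration only. The small ridge term plays the role that a zero-mean Gaussian prior on the factors would play in a probabilistic formulation.

```python
# Toy matrix completion by rank-one alternating least squares (ALS).
# Assumption: every number here is synthetic and purely illustrative; this is
# NOT the model proposed in the report and NOT measured data.


def complete_rank_one(matrix, n_iter=200, ridge=1e-6):
    """Return the rank-one reconstruction u[i] * v[j] of `matrix`.

    `matrix` is a list of rows (e.g. solutes) over columns (e.g. solvents);
    None marks an unobserved entry. Only observed entries enter the fit.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    u = [1.0] * n_rows  # one latent feature per row
    v = [1.0] * n_cols  # one latent feature per column
    for _ in range(n_iter):
        # Least-squares update of each row feature, column features fixed.
        for i in range(n_rows):
            obs = [j for j in range(n_cols) if matrix[i][j] is not None]
            num = sum(matrix[i][j] * v[j] for j in obs)
            den = ridge + sum(v[j] ** 2 for j in obs)
            u[i] = num / den
        # Least-squares update of each column feature, row features fixed.
        for j in range(n_cols):
            obs = [i for i in range(n_rows) if matrix[i][j] is not None]
            num = sum(matrix[i][j] * u[i] for i in obs)
            den = ridge + sum(u[i] ** 2 for i in obs)
            v[j] = num / den
    return [[u[i] * v[j] for j in range(n_cols)] for i in range(n_rows)]


# Synthetic "ground truth" with exact rank-one structure (hypothetical values).
ROW_EFFECT = [0.5, 1.0, 1.5, 2.0]
COL_EFFECT = [1.0, -0.5, 2.0, 0.8]
truth = [[a * b for b in COL_EFFECT] for a in ROW_EFFECT]

# Hide a few entries to mimic mixtures without experimental data.
hidden = {(0, 3), (2, 1), (3, 0)}
observed = [
    [None if (i, j) in hidden else truth[i][j] for j in range(4)]
    for i in range(4)
]

completed = complete_rank_one(observed)
max_error = max(abs(completed[i][j] - truth[i][j]) for i, j in hidden)
print("largest error on hidden entries:", max_error)
```

In this toy setting the hidden entries are recoverable because the synthetic data are exactly rank one and every row and column retains several observed entries. Real data are noisy and would require more latent features and a proper treatment of uncertainty.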
[Figure: Activity Coefficients at Infinite Dilution; labels: Solutes, Solvents]
This document is the unedited authors' version of a submitted work that was subsequently accepted for publication in The
In this work, we describe a novel application of Machine Learning (ML) to the field of physical chemistry and thermodynamics: the prediction of physico-chemical properties of binary liquid mixtures by matrix completion. We focus on the prediction of a single property: the so-called activity coefficient, which is a measure of the non-ideality of a liquid mixture and of enormous relevance in practice. The interesting aspect of our approach is that no expert knowledge about the components that make up the mixture was used: all we needed was an incomplete, sparse data set of binary mixtures and their measured activity coefficients, which our method was able to complete successfully. We show that this simple approach outperforms an established procedure that has been the state of the art for several decades.
ML approaches to chemical and engineering sciences date back more than 50 years, but the genuine exploitation of the potential of ML in these fields has only recently begun 1. An overview of recent advances in chemical and material sciences has, e.g., been given by Ramprasad et al. 2 and Butler et al. 3. ML has already been used to predict physico-chemical properties of mixtures, including activity coefficients 4-10. Most of these approaches are basically quantitative structure-property relationship (QSPR) methods 11 that use physical descriptors of the components as input data to characterize the considered mixtures and relate them to the property of interest by an ML algorithm, e.g., a neural network. However, the scope of these approaches is in general rather small.
Binary mixtures are of fundamental importance in chemical engineering. In general, the properties of mixtures cannot be described based on the properties of the pure components alone.
If, however, the respective properties of the binary constituent 'sub-mixtures' of a multi-component mixture are known, the properties of the multi...