Knowledge of the glass transition temperature of molecular compounds that occur in atmospheric aerosol particles is important for estimating their viscosity, which directly influences the kinetics of chemical reactions and the particle phase state. Although a great diversity of organic compounds is present in aerosol particles, experimental glass transition temperatures are known for only a minor fraction of them. We have therefore developed a machine learning model that predicts the glass transition temperature of organic molecular compounds from molecule-derived input variables. The extremely randomized trees (extra trees) procedure was chosen for this purpose. Two approaches using different sets of input variables were followed. The first uses the counts of selected functional groups present in the compound, while the second generates descriptors from a SMILES (Simplified Molecular Input Line Entry System) string. Organic compounds containing carbon, hydrogen, oxygen, nitrogen, and halogen atoms are included. For improved results, both approaches can be combined with the melting temperature of the compound, if known, as an additional input variable. Both approaches achieve a similar mean absolute error (MAE) of about 12–13 K, with the SMILES-based predictions performing slightly better. In general, the model shows good predictive power considering the diversity of the experimental input data. We also show that its performance exceeds that of previous parameterizations developed for this purpose and that of existing machine learning models. To provide user-friendly versions of the model, we have developed a website where interested scientists can run the model via a web-based interface without prior technical knowledge. We also provide Python code of the model. Additionally, all experimental input data are provided in the form of the Bielefeld Molecular Organic Glasses (BIMOG) database. We believe that this model is a powerful tool for many applications in atmospheric aerosol science and material science.
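To make the functional-group approach more concrete, the following is a minimal from-scratch sketch of extra-trees regression on functional-group counts. It is not the authors' code: the feature set (`n_carbon`, `n_hydroxyl`, `n_carboxyl`), the helper names, the hyperparameters, and above all the data are assumptions made for illustration. The target values come from an arbitrary formula and are not taken from BIMOG or from any measurement. In practice a library implementation such as scikit-learn's `ExtraTreesRegressor` would normally be used; the pure-Python version here only shows the idea that split thresholds are drawn at random and every tree is trained on the full data set.

```python
import random

def sse(values):
    """Sum of squared deviations from the mean (node impurity)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, rng, max_features):
    """Grow one extremely randomized regression tree as nested dicts."""
    # Features that still take more than one value in this node.
    varying = [j for j in range(len(X[0])) if len({row[j] for row in X}) > 1]
    if len(set(y)) == 1 or not varying:
        return {"value": sum(y) / len(y)}
    best = None
    for j in rng.sample(varying, min(max_features, len(varying))):
        column = [row[j] for row in X]
        lo, hi = min(column), max(column)
        # Extra-trees step: the cut point is drawn at random, not optimized.
        t = rng.uniform(lo, hi)
        if t >= hi:  # guard against rounding so both children are non-empty
            t = lo
        left = [i for i, v in enumerate(column) if v <= t]
        right = [i for i, v in enumerate(column) if v > t]
        # Keep the candidate whose children have the lowest total impurity.
        score = sse([y[i] for i in left]) + sse([y[i] for i in right])
        if best is None or score < best[0]:
            best = (score, j, t, left, right)
    _, j, t, left, right = best
    return {
        "feature": j,
        "threshold": t,
        "left": build_tree([X[i] for i in left], [y[i] for i in left], rng, max_features),
        "right": build_tree([X[i] for i in right], [y[i] for i in right], rng, max_features),
    }

def fit_extra_trees(X, y, n_trees=50, max_features=None, seed=0):
    """Train an ensemble; every tree sees the full training set (no bootstrap)."""
    rng = random.Random(seed)
    k = max_features or len(X[0])
    return [build_tree(X, y, rng, k) for _ in range(n_trees)]

def predict(trees, row):
    """Average the leaf values reached by `row` across all trees."""
    total = 0.0
    for node in trees:
        while "value" not in node:
            branch = "left" if row[node["feature"]] <= node["threshold"] else "right"
            node = node[branch]
        total += node["value"]
    return total / len(trees)

def mean_absolute_error(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

# Synthetic placeholder data: rows are (n_carbon, n_hydroxyl, n_carboxyl).
# The target formula is arbitrary and NOT fitted to real measurements.
X = [(c, oh, cooh) for c in range(2, 8) for oh in range(3) for cooh in range(2)]
y = [150.0 + 6.0 * c + 25.0 * oh + 40.0 * cooh for c, oh, cooh in X]

# Hold out every fourth row to estimate the error on unseen compounds.
X_train = [r for i, r in enumerate(X) if i % 4 != 0]
y_train = [v for i, v in enumerate(y) if i % 4 != 0]
X_test = [r for i, r in enumerate(X) if i % 4 == 0]
y_test = [v for i, v in enumerate(y) if i % 4 == 0]

trees = fit_extra_trees(X_train, y_train)
train_mae = mean_absolute_error(y_train, [predict(trees, r) for r in X_train])
test_mae = mean_absolute_error(y_test, [predict(trees, r) for r in X_test])
print(f"train MAE: {train_mae:.2f} K, held-out MAE: {test_mae:.2f} K")
```

Because the trees are grown until every leaf is pure, the training error in this sketch is zero by construction, so only the held-out MAE says anything about predictive quality. The melting temperature, where known, could be appended as a fourth feature column without changing the code.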