Artificial Neural Networks (NN) are already heavily involved in methods and applications for frequent tasks in the field of computational chemistry such as representation of potential energy surfaces (PES) and...
An essential aspect for adequate predictions of chemical properties by machine learning models is the database used for training them. However, studies that analyze how the content and structure of the training databases impact the prediction quality are scarce. In this work, we analyze and quantify the relationships learned by a machine learning model (Neural Network) trained on five different reference databases (QM9, PC9, ANI-1E, ANI-1, and ANI-1x) to predict tautomerization energies of molecules in Tautobase. To this end, the influence of characteristics such as the number of heavy atoms in a molecule, the number of atoms of a given element, bond composition, or initial geometry on the quality of the predictions is considered. The results indicate that training on a chemically diverse database is crucial for obtaining good results and also that conformational sampling can partly compensate for limited coverage of chemical diversity. The overall best-performing reference database (ANI-1x) performs on average 1 kcal/mol better than PC9, which, however, contains about 2 orders of magnitude fewer reference structures. On the other hand, PC9 is chemically more diverse by a factor of ∼5 as quantified by the number of atom-in-molecule-based fragments (amons) it contains compared with the ANI family of databases. A quantitative measure for deficiencies is the Kullback–Leibler divergence between reference and target distributions. It is explicitly demonstrated that when certain types of bonds need to be covered in the target database (Tautobase) but are undersampled in the reference databases, the resulting predictions are poor. Examples include the poor performance of all analyzed databases for C(sp²)–C(sp²) double bonds close to heteroatoms and for azoles containing N–N and N–O bonds.
Analysis of the results with a Tree MAP algorithm provides a deeper understanding of specific deficiencies in predicting tautomerization energies by the reference datasets due to inadequate coverage of chemical space. This information can be used either to improve existing databases or to generate new databases of sufficient diversity for a range of machine learning (ML) applications in chemistry.
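The Kullback–Leibler divergence mentioned above as a measure of database deficiencies can be illustrated with a minimal sketch. It assumes the reference and target databases are each summarized as a histogram of bond-type counts. The function name `kl_divergence`, the direction D_KL(target ‖ reference), the `eps` floor, and the example counts are all illustrative assumptions and are not taken from the paper.

```python
import math

def kl_divergence(p_counts, q_counts, eps=1e-12):
    """Return D_KL(P || Q) in nats for two histograms given as {category: count}.

    P and Q are normalized internally. Categories missing from Q are floored
    at `eps` (an assumption made here to avoid log(0)).
    """
    p_total = sum(p_counts.values())
    q_total = sum(q_counts.values())
    kl = 0.0
    for key in set(p_counts) | set(q_counts):
        p = p_counts.get(key, 0) / p_total
        q = max(q_counts.get(key, 0) / q_total, eps)
        if p > 0:  # terms with p == 0 contribute nothing
            kl += p * math.log(p / q)
    return kl

# Hypothetical bond-type counts, for illustration only (not data from the paper).
target = {"C-C": 50, "C=C": 30, "N-N": 15, "N-O": 5}
reference = {"C-C": 70, "C=C": 28, "N-N": 1, "N-O": 1}

print(kl_divergence(target, reference))
```

In this sketch, bond types that are frequent in the target but rare in the reference (here N–N and N–O) dominate the sum, which is the kind of undersampling the abstract associates with poor predictions.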
The spectroscopy and structural dynamics of a deep eutectic mixture (KSCN/acetamide) with varying water content are investigated using 2D IR (with the C–N stretch vibration of the SCN⁻ anions as the reporter) and THz spectroscopy. Molecular dynamics simulations correctly describe the nontrivial dependence of both spectroscopic signatures on water content. For the 2D IR spectra, the MD simulations relate the steep increase in the cross-relaxation rate at high water content to the parallel alignment of packed SCN⁻ anions. Conversely, the nonlinear increase of the THz absorption with increasing water content is mainly attributed to the formation of larger water clusters. The results demonstrate that a combination of structure-sensitive spectroscopies and molecular dynamics simulations provides molecular-level insights into the emergence of heterogeneity in such mixtures as their composition is modulated.
Ionic liquids (IL) are remarkable green solvents, which find applications in many areas of nano- and biotechnology including extraction and purification of value-added compounds or fine chemicals. These liquid salts possess versatile solvation properties that can be tuned by modifications in the cation or anion structure. So far, in contrast to the great success of theoretical and computational methodologies applied to other fields, only a few IL models have been able to bring insights towards the rational design of such solvents. In this work, we develop coarse-grained (CG) models for imidazolium-based ILs using a new version of the Martini force field. The model is able to reproduce the main structural properties of pure ILs, including spatial heterogeneity and global densities over a wide range of temperatures. More importantly, given the high intermolecular compatibility of the Martini force field, this new IL CG model opens the possibility of large-scale simulations of liquid-liquid extraction experiments. As examples, we show two applications, namely the extraction of aromatic molecules from a petroleum oil model and the extraction of omega-3 polyunsaturated fatty acids from a fish oil model. In semi-quantitative agreement with the experiments, we show how the extraction capacity and selectivity of the IL could be affected by the cation chain length or addition of co-solvents.