Machine learning and high-throughput computational screening have been valuable tools in accelerated first-principles screening for the discovery of the next generation of functionalized molecules and materials. Applying machine learning to chemistry requires converting molecular structures to a machine-readable format known as a molecular representation. The choice of representation affects the performance and outcomes of chemical machine learning methods. Herein, we present a new concise molecular representation derived from persistent homology, an applied branch of mathematics. We demonstrate its applicability in a high-throughput computational screening of a large molecular database (GDB-9) with more than 133,000 organic molecules. Our target is to identify novel molecules that selectively interact with CO2. The methodology and performance of the novel molecular fingerprinting method are presented, and the new chemically-driven persistence image representation is used to screen the GDB-9 database to suggest molecules and/or functional groups with enhanced properties.
High carbon emissions are strongly correlated with rising global temperatures as the world's climate undergoes a dramatic shift. Work to mitigate the potential damage using materials such as metal-organic frameworks (MOFs), covalent organic frameworks (COFs), and polymer membranes (PMs) has proven successful in small-scale approaches; however, research is still being performed to enhance the capabilities of these materials for use at an industrial scale. One strategy for increasing performance is to embed these materials with CO2-philic molecules, which enhance selective binding of CO2 over other gases. Calixarenes are promising candidates because their large chalice shape makes it possible to bind multiple CO2 molecules per site. In this study, a dataset of 40 functionalized calixarene structures and one unfunctionalized (bare) calixarene was constructed using automated, high-throughput structure generation through directed modifications to a molecular scaffold. A conformational search based on molecular mechanics allowed faster determination of optimal binding energies for a wide array of chemical functional groups with less computational effort. Density functional theory and symmetry-adapted perturbation theory calculations were performed to explore their interactions with CO2. Our work has identified new organic cages with increased CO2-philicity. In four cases, CO2 binding is stronger than 9.0 kcal/mol and very close to the targets set by previous studies. The nature of the noncovalent interactions for these cases is analyzed and discussed. Conclusions from this study can aid synthetic efforts for the next generation of functional materials.
In the original version of this article, some text was missing from the legends of Figure 5 and Figure 6. The legend of Figure 5 originally read: "CO2 interaction energy distribution shown as horizontal violin plots for the first, second, and third active-learning steps. The height of the shape shows the frequency of occurrences…" The correct version states: "CO2 interaction energy distribution shown as horizontal violin plots for the first, second, and third active-learning steps. a CM, b BoB, c SOAP, and d PI. The height of the shape shows the frequency of occurrences…" This has been corrected in both the PDF and HTML versions of the article. The legend of Figure 6 originally read: "Predicted CO2 and N2 interaction energies (in kcal mol⁻¹) for all molecules in the GDB-9 database using four molecular representation models. Only the model that utilized the PI molecular representation …" The correct version states: "Predicted CO2 and N2 interaction energies (in kcal mol⁻¹) for all molecules in the GDB-9 database using four molecular representation models. a CM, b BoB, c SOAP, and d PI. Only the model that utilized the PI molecular representation …" This has been corrected in both the PDF and HTML versions of the article.
Developing alternative strategies for efficient separation of CO2 and N2 is of general interest for the reduction of anthropogenic carbon emissions. In recent years, machine learning and high-throughput computational screening have been valuable tools in accelerated first-principles screening for the discovery of the next generation of functionalized molecules and materials. Applying machine learning to chemistry requires converting molecular structures to a machine-readable format known as a molecular representation. The choice of representation affects the performance and outcomes of chemical machine learning methods. Herein, we present a new concise and size-consistent molecular representation derived from persistent homology, an applied branch of mathematics. We demonstrate its applicability in a high-throughput computational screening of a large molecular database (GDB-9) with more than 133,000 organic molecules. Our target is to identify novel molecules that selectively interact with CO2. The methodology and performance of the novel molecular fingerprinting method are presented, and the new chemically-driven persistence image representation is used to screen the GDB-9 database to suggest molecules and/or functional groups with enhanced properties.