Looking through glass: Knowledge discovery from materials science literature using natural language processing
Highlights
- Natural language processing is used for information extraction from research papers
- Caption cluster plots are used for exploring figure captions across the entire corpus
- Elemental maps are used to identify the chemical elements reported in a study
- A framework to extract domain-specific queries from the literature
The progress of human civilization has always been closely associated with the discovery of new materials. This is probably why the tripartite classification of historical periods is also based on materials: the Stone, Bronze, and Iron Ages. Beyond these materials, several others have significantly improved the quality of human life, namely steel, aluminum, glass, plastics, and, most recently, nanomaterials. Among these materials, glasses hold a unique place in human lives, considering their applications ranging from everyday glass utensils and kitchenware to
Due to their excellent optical properties, glasses are used for various applications ranging from smartphone screens to telescopes. Developing compositions with tailored Abbe number (Vd) and refractive index at 587.6 nm (nd), two crucial optical properties, is a major challenge. To this end, machine learning (ML) approaches have been successfully used to develop composition-property models. However, these models are essentially black boxes and lack interpretability. In this paper, we demonstrate the use of ML models to predict the composition-dependent variations of Vd and nd. Further, using Shapley additive explanations (SHAP), we interpret the ML models to identify the contribution of each input component toward the target prediction. We observe that glass formers such as SiO2, B2O3, and P2O5 and intermediates such as TiO2, PbO, and Bi2O3 play a significant role in controlling the optical properties. Interestingly, components that increase nd are found to decrease Vd, and vice versa. Finally, we develop the Abbe diagram using the ML models, allowing accelerated discovery of new glasses with optical properties beyond the experimental Pareto front. Overall, employing explainable ML, we predict and interpret the compositional control on the optical properties of oxide glasses.
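To make the attribution idea concrete, the sketch below computes exact Shapley values for a tiny composition-to-property function. It is an illustration only, not the method or model used in the paper: the function `toy_nd`, its coefficients, the three-oxide composition, and the baseline are all hypothetical, and the brute-force enumeration stands in for the approximations that the SHAP approach relies on for real models.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x` relative to `baseline`.

    Features outside a coalition are held at their baseline value.
    The cost grows as 2^n, so this only suits a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi


def toy_nd(c):
    """Hypothetical refractive-index model; the coefficients are made up."""
    sio2, tio2, pbo = c
    return 1.45 + 0.05 * sio2 + 0.60 * tio2 + 0.50 * pbo + 0.20 * tio2 * pbo


x = [0.6, 0.2, 0.2]          # hypothetical mole fractions of SiO2, TiO2, PbO
baseline = [1.0, 0.0, 0.0]   # hypothetical reference composition: pure SiO2
phi = shapley_values(toy_nd, x, baseline)

for name, value in zip(["SiO2", "TiO2", "PbO"], phi):
    print(f"{name}: {value:+.4f}")
```

The attributions sum to the difference between the prediction at `x` and the prediction at the baseline, which is the property that lets a per-component contribution be read as a share of the predicted change. The TiO2-PbO interaction term in the toy function is split equally between the two components.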
Data-driven synthesis planning with machine learning is a key step in the design and discovery of novel inorganic compounds with desirable properties. Inorganic materials synthesis is often guided by chemists' prior knowledge and experience, built upon experimental trial and error that is both time- and resource-consuming. Recent developments in natural language processing (NLP) have enabled large-scale text mining of scientific literature, providing open-source databases of synthesis information covering synthesized compounds, material precursors, and reaction conditions (temperatures, times). In this work, we employ a conditional variational autoencoder (CVAE) to predict suitable inorganic reaction conditions for the crucial inorganic synthesis steps of calcination and sintering. We find that the CVAE model is capable of learning subtle differences in target material composition, precursor compound identities, and choice of synthesis route (solid-state, sol-gel) that are present in the inorganic synthesis space. Moreover, the CVAE generalizes well to unseen chemical entities and shows promise for predicting reaction conditions for previously unsynthesized compounds of interest.
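For orientation, the standard training objective of a conditional variational autoencoder can be written as below. This is the generic textbook form, not necessarily the exact loss used in the work. The mapping of symbols is an assumption: x stands for the reaction conditions to be predicted (for example, temperature and time), c for the conditioning information (target composition, precursors, synthesis route), and z for the latent variable.

```latex
\mathcal{L}(\theta, \phi; x, c) =
  \mathbb{E}_{q_\phi(z \mid x, c)}\big[\log p_\theta(x \mid z, c)\big]
  - D_{\mathrm{KL}}\big(q_\phi(z \mid x, c) \,\|\, p(z \mid c)\big)
```

The first term rewards accurate reconstruction of x given the condition c, and the second keeps the encoder distribution close to the prior, which is often taken to be a standard normal. At prediction time, z is sampled from the prior and decoded together with c, which is how such a model can propose conditions for a compound it has not seen.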