The availability of, and easy access to, large-scale experimental and computational materials data have accelerated the development of algorithms and models for materials property prediction, structure prediction, and generative design of materials. However, the lack of user-friendly materials informatics web servers has severely constrained the adoption of such tools in materials scientists' daily practice of materials screening, tinkering, and design space exploration. Herein we first survey current materials informatics web apps and then propose and develop MaterialsAtlas.org, a web-based materials informatics toolbox for materials discovery. It provides a variety of routinely needed tools for exploratory materials discovery, including composition and structure validity checks (e.g. charge neutrality, electronegativity balance, dynamic stability, Pauling rules), materials property prediction (e.g. band gap, elastic moduli, hardness, and thermal conductivity), search for hypothetical materials, and utility tools. These user-friendly tools can be freely accessed at http://www.materialsatlas.org. We argue that such materials informatics apps should be widely developed by the community to speed up materials discovery processes.
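To make the charge-neutrality check mentioned above concrete, here is a minimal sketch. It assumes the check means "some assignment of one common oxidation state per element sums to zero"; the function name and the small oxidation-state table are hypothetical and are not taken from MaterialsAtlas.org.

```python
from itertools import product

# Illustrative subset of common oxidation states (an assumption of this
# sketch, not the table used by MaterialsAtlas.org).
COMMON_OXIDATION_STATES = {
    "Na": (1,),
    "Sr": (2,),
    "Ti": (2, 3, 4),
    "Fe": (2, 3),
    "O": (-2,),
    "Cl": (-1,),
}


def is_charge_neutral(composition):
    """Return True if one oxidation state per element can sum to zero.

    composition maps element symbols to integer atom counts,
    e.g. {"Sr": 1, "Ti": 1, "O": 3}.
    """
    elements = list(composition)
    state_choices = [COMMON_OXIDATION_STATES[el] for el in elements]
    # Try every combination of one oxidation state per element.
    for states in product(*state_choices):
        total = sum(q * composition[el] for q, el in zip(states, elements))
        if total == 0:
            return True
    return False


print(is_charge_neutral({"Sr": 1, "Ti": 1, "O": 3}))  # SrTiO3
print(is_charge_neutral({"Na": 1, "Cl": 2}))          # NaCl2
```

Brute-force enumeration is adequate here because a formula contains only a handful of elements, each with a few common oxidation states. Mixed-valence compounds, where one element takes two states at once, would need a finer-grained search.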
Fast and accurate crystal structure prediction (CSP) algorithms and web servers are highly desirable for the exploration and discovery of new materials in the infinite chemical design space. Currently, however, the computationally expensive CSP algorithms based on first-principles calculations are applicable only to relatively small systems and are out of reach of most materials researchers. Several teams have used an element substitution approach for generating or predicting new structures, but usually in an ad hoc way. Here we develop a template-based crystal structure prediction (TCSP) algorithm and a companion web server that makes the tool accessible to all materials researchers. Our algorithm uses elemental/chemical similarity and oxidation states to guide the selection of template structures, ranks them by substitution compatibility, and can return multiple predictions with ranking scores in a few minutes. A benchmark study on the 98290 formulas of the Materials Project database using leave-one-out evaluation shows that our algorithm achieves high accuracy for a large portion of the formulas (for 13145 target structures, TCSP predicted their structures with root-mean-square deviation < 0.1). We have also used TCSP to discover new materials of the Ga–B–N system, showing its potential for high-throughput materials discovery. Our user-friendly web app TCSP can be accessed freely on our MaterialsAtlas.org web app platform.
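The template selection and ranking step can be illustrated with a toy sketch. It is not the TCSP algorithm: the abstract does not specify the similarity measure, so this version assumes that templates must share the target's stoichiometry pattern and uses the gap in Pauling electronegativity as a crude stand-in for elemental similarity. All function names are hypothetical, and oxidation states are ignored.

```python
import re

# Pauling electronegativities for a few elements, used as a crude stand-in
# for the element-similarity measure (an assumption of this sketch).
ELECTRONEGATIVITY = {
    "Ca": 1.00, "Sr": 0.95, "Ba": 0.89, "Ti": 1.54, "Zr": 1.33,
    "Na": 0.93, "O": 3.44, "Cl": 3.16,
}


def parse_formula(formula):
    """Parse a simple formula such as 'SrTiO3' into {'Sr': 1, 'Ti': 1, 'O': 3}."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + int(number or 1)
    return counts


def substitution_cost(target, template):
    """Summed electronegativity gap between paired elements, or None.

    Elements are paired only when their atom counts match. Within equal
    counts, pairing in sorted electronegativity order minimizes the sum.
    """
    def sort_key(item):
        return (item[1], ELECTRONEGATIVITY[item[0]])

    t = sorted(target.items(), key=sort_key)
    s = sorted(template.items(), key=sort_key)
    if [n for _, n in t] != [n for _, n in s]:
        return None  # different stoichiometry pattern: not a usable template
    return sum(
        abs(ELECTRONEGATIVITY[a] - ELECTRONEGATIVITY[b])
        for (a, _), (b, _) in zip(t, s)
    )


def rank_templates(target_formula, template_formulas):
    """Return compatible template formulas, most similar first."""
    target = parse_formula(target_formula)
    scored = []
    for name in template_formulas:
        cost = substitution_cost(target, parse_formula(name))
        if cost is not None:
            scored.append((cost, name))
    return [name for _, name in sorted(scored)]


print(rank_templates("SrTiO3", ["NaCl", "BaZrO3", "CaTiO3"]))
```

For the target SrTiO3, NaCl is rejected because its stoichiometry pattern differs, and CaTiO3 ranks ahead of BaZrO3 because only one closely matched element needs to be substituted.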
Performing first-principles calculations to discover electrode properties across a large chemical space is a challenging task. While machine learning (ML) has been applied to accelerate those discoveries, most of the applied methods ignore the materials' spatial information and use only pre-defined features based on chemical composition. We propose two attention-based graph convolutional neural network techniques to learn the average voltage of electrodes. Our proposed method, which combines atomic composition with atomic coordinates in 3D space, improves the accuracy of voltage prediction by 17% compared with composition-based ML models. The first model directly learns the chemical reaction of electrodes and metal ions to predict their average voltage, whereas the second model uses the electrodes' ML-predicted formation energy (E_form) to compute their average voltage. Our models demonstrate improved accuracy in transferring from our subset of learned metal ions to other metal ions.
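The second model's route from formation energy to voltage can be sketched with the standard thermodynamic relation for an insertion reaction, host + n A -> A_n host. This is a minimal sketch and not necessarily the authors' exact formulation. It assumes formation energies in eV per formula unit, referenced to the elements, so that the bulk metal anode contributes zero. The function name and the example numbers are hypothetical.

```python
def average_voltage(e_form_charged, e_form_discharged, n_ions, ion_charge=1):
    """Average voltage in volts for the reaction host + n A -> A_n host.

    e_form_charged:    formation energy of the host (eV per formula unit)
    e_form_discharged: formation energy of A_n host (eV per formula unit)
    n_ions:            working ions inserted per formula unit
    ion_charge:        charge carried by each ion (1 for Li, 2 for Mg)

    Assumes the anode is the bulk metal A, whose formation energy is zero.
    """
    if n_ions <= 0 or ion_charge <= 0:
        raise ValueError("n_ions and ion_charge must be positive")
    reaction_energy = e_form_discharged - e_form_charged  # eV per formula unit
    # One eV per transferred elementary charge corresponds to one volt.
    return -reaction_energy / (n_ions * ion_charge)


# Hypothetical numbers, chosen only to illustrate the arithmetic.
print(average_voltage(-3.0, -6.6, n_ions=1))
```

A more negative formation energy for the discharged phase gives a higher voltage. For a multivalent ion, the same reaction energy is divided over more transferred electrons.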
Transformer language models pre-trained on large unlabeled corpora have produced state-of-the-art results in natural language processing, organic molecule design, and protein sequence generation. However, no such models have been applied to learn composition patterns for the generative design of material compositions. Here we train a series of seven modern transformer models (GPT, GPT-2, GPT-Neo, GPT-J, BLMM, BART, and RoBERTa) for materials design using the expanded formulas of the ICSD, OQMD, and Materials Project databases. Six datasets, with or without samples that are not charge neutral or not electronegativity balanced, are used to benchmark generative design performance and to uncover the biases of modern transformer models in the generative design of materials compositions. Our experiments show that materials transformers based on causal language models can generate chemically valid materials compositions, of which up to 97.54% are charge neutral and up to 91.40% are electronegativity balanced, a more than sixfold enrichment over the baseline pseudo-random sampling algorithm. Our language models also demonstrate high generation novelty, and their potential for new materials discovery is demonstrated by their ability to recover held-out materials. We also find that the properties of the generated compositions can be tailored by training the models on selected training sets such as high-bandgap samples. Our experiments further show that each model has its own preferences in the properties of the generated samples and that running time varies considerably between models. We have applied our materials transformers to discover a set of new materials, validated using DFT calculations. All our trained materials transformer models and code can be accessed freely at http://www.github.com/usccolumbia/MTransformer.
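The abstract refers to training on "expanded formulas" without defining the encoding. A plausible reading, assumed here, is that each element symbol is repeated once per atom so that a formula becomes a token sequence a language model can consume. The function name and the token ordering (order of appearance in the formula) are assumptions of this sketch.

```python
import re


def expand_formula(formula):
    """Expand a simple formula into a space-separated element token sequence.

    Handles only flat formulas with integer counts (no parentheses or
    fractional occupancies), which is enough to illustrate the idea.
    """
    tokens = []
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        tokens.extend([symbol] * int(number or 1))
    return " ".join(tokens)


print(expand_formula("SrTiO3"))  # prints: Sr Ti O O O
```

Under this encoding, stoichiometry is carried by token repetition rather than by numerals.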