Large auto-generated databases of magnetic materials properties have the potential for great utility in materials science research. This article presents an auto-generated database of 39,822 records containing chemical compounds and their associated Curie and Néel magnetic phase transition temperatures. The database was produced using natural language processing and semi-supervised quaternary relationship extraction, applied to a corpus of 68,078 chemistry and physics articles. Evaluation of the database shows an estimated overall precision of 73%. Therein, records processed with the text-mining toolkit, ChemDataExtractor, were assisted by a modified Snowball algorithm, whose original binary relationship extraction capabilities were extended to quaternary relationship extraction. Consequently, its machine learning component can now train with ≤ 500 seeds, rather than the 4,000 originally used. Data processed with the modified Snowball algorithm affords 82% precision. Database records are available in MongoDB, CSV and JSON formats, which can easily be read using Python, R, Java and MATLAB. This makes the database easy to query for tackling big-data materials science initiatives and provides a basis for magnetic materials discovery.
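As a minimal sketch of how such records might be read and queried in Python, the example below parses a small CSV sample and filters it by transition type and temperature range. The column names (`compound`, `property`, `value`, `units`) and the four sample rows are assumptions made for illustration; they are not taken from the published database, whose actual schema may differ.

```python
import csv
import io

# Illustrative rows only; column names are assumed, not the database's real schema.
SAMPLE_CSV = """compound,property,value,units
Fe,Curie,1043,K
NiO,Néel,525,K
Gd,Curie,293,K
MnO,Néel,118,K
"""


def load_records(text):
    """Parse CSV text into dicts, converting the temperature value to float."""
    reader = csv.DictReader(io.StringIO(text))
    return [{**row, "value": float(row["value"])} for row in reader]


def filter_records(records, prop, min_k=None, max_k=None):
    """Keep records of one transition type within an optional kelvin range."""
    selected = []
    for record in records:
        if record["property"] != prop:
            continue
        if min_k is not None and record["value"] < min_k:
            continue
        if max_k is not None and record["value"] > max_k:
            continue
        selected.append(record)
    return selected


records = load_records(SAMPLE_CSV)
# Curie temperatures near room temperature (250 K to 350 K).
near_room = filter_records(records, "Curie", min_k=250, max_k=350)
```

The same filter would apply to records loaded from the JSON export with `json.load`, provided the field names are adjusted to match the real files.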
Generative models have been successfully used to synthesize completely novel images, text, music, and speech. As such, they present an exciting opportunity for the design of new materials for functional applications. So far, generative deep-learning methods applied to molecular and drug discovery have yet to produce stable and novel 3-D crystal structures across multiple material classes. To that end, we present an autoencoder-based generative deep-representation learning pipeline for geometrically optimized 3-D crystal structures that simultaneously predicts the values of eight target properties. The system is highly general, as demonstrated through the creation of novel materials from three separate material classes: binary alloys, ternary perovskites, and Heusler compounds. Comparison of these generated structures to those optimized via electronic-structure calculations shows that our generated materials are valid and geometrically optimized.
The ever-growing abundance of data found in heterogeneous sources, such as scientific publications, has forced the development of automated techniques for data extraction. While in the past, in the physical sciences domain, the focus has been on the precise extraction of individual properties, attention has recently been devoted to the extraction of higher-level relationships. Here, we present a framework for the automated population of ontologies, that is, the direct extraction of a larger group of properties linked by a semantic network. We exploit data-rich sources, such as tables within documents, and present a new model concept that enables data extraction for chemical and physical properties with the ability to organize hierarchical data as nested information. Combining these capabilities with automatically generated parsers for data extraction and forward-looking interdependency resolution, we illustrate the power of our approach via the automatic extraction of a crystallographic hierarchy of information. This includes 18 interrelated submodels of nested data, extracted from an evaluation set of scientific articles, yielding an overall precision of 92.2% across 26 different journals. Our method and associated toolkit, ChemDataExtractor 2.0, offer a key step toward the seamless integration of primary literature sources into a data-driven scientific framework.
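The idea of organizing hierarchical data as nested information can be sketched with plain Python dataclasses. This is not the ChemDataExtractor 2.0 API; the class names (`Quantity`, `UnitCell`, `CrystalRecord`), their fields, and the `flatten` helper are hypothetical, and the NaCl values are included only as a familiar illustration.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Quantity:
    """A numeric value with its unit (hypothetical submodel)."""
    value: float
    units: str


@dataclass
class UnitCell:
    """Lattice information nested inside a crystal record (hypothetical)."""
    a: Optional[Quantity] = None
    b: Optional[Quantity] = None
    c: Optional[Quantity] = None
    space_group: Optional[str] = None


@dataclass
class CrystalRecord:
    """Top-level record that owns the nested unit-cell submodel."""
    compound: str
    cell: UnitCell = field(default_factory=UnitCell)


def flatten(nested, prefix=""):
    """Turn a nested dict into dotted paths, skipping fields left unset."""
    flat = {}
    for key, value in nested.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        elif value is not None:
            flat[path] = value
    return flat


record = CrystalRecord(
    compound="NaCl",
    cell=UnitCell(a=Quantity(5.64, "Å"), space_group="Fm-3m"),
)
flat = flatten(asdict(record))
```

Flattening to dotted paths such as `cell.a.value` is one simple way to export a nested record to a tabular format while keeping track of which submodel each value came from.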
Predicting the properties of materials prior to their synthesis is of great importance in materials science. Magnetic and superconducting materials exhibit a number of unique properties that make them useful in a wide variety of applications, including solid oxide fuel cells, solid-state refrigerants, photon detectors and metrology devices. In all these applications, phase transitions play an important role in determining the feasibility of the materials in question. Here, we present a pipeline for fully integrating data extracted from the scientific literature into machine-learning tools for property prediction and materials discovery. Using advanced natural language processing (NLP) and machine-learning techniques, we successfully reconstruct the phase diagrams of well-known magnetic and superconducting compounds, and demonstrate that it is possible to predict the phase-transition temperatures of compounds not present in the database. We provide the tool as an online open-source platform, forming the basis for further research into magnetic and superconducting materials discovery for potential device applications. npj Computational Materials (2020) 6:18; https://doi.
Magnetic materials play an important role in a wide variety of everyday applications, and they are critical components in many devices used for energy conversion. However, there are very few materials known to exhibit magnetism of any kind, and the slow process of experimentally driven magnetic-materials discovery has limited the development of devices for functional applications. In this work, a complete magnetic-materials discovery pipeline is presented that uses natural language processing (NLP), machine learning, and generative models to predict ferromagnetic compounds in the Heusler alloy family. Using the “chemistry-aware” NLP toolkit, ChemDataExtractor, a database of 2910 magnetocaloric compounds is autogenerated by sourcing from the scientific literature. These data are then used to train property-prediction models for key figures of merit that describe the magnetocaloric effect. The predictive models are applied to novel Heusler alloy material candidates that have been created using deep generative representation learning. Convex-hull meta-stability analysis and ab initio validation of these candidates identify six potential materials for solid-state refrigeration applications.
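The convex-hull stability check mentioned above can be illustrated with a simplified sketch. It assumes a binary A-B system rather than the ternary Heusler compositions studied in the work, and the formation energies and function names are hypothetical. A phase's energy above the hull is its formation energy minus the hull energy interpolated at the same composition; small positive values are commonly read as metastable.

```python
def lower_hull(points):
    """Lower convex hull of (composition, energy) points, left to right."""
    # Keep only the lowest-energy phase at each composition.
    best = {}
    for x, energy in points:
        best[x] = min(energy, best.get(x, energy))
    hull = []
    for p in sorted(best.items()):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Cross product <= 0 means hull[-1] lies on or above the line
            # from hull[-2] to p, so it is not part of the lower hull.
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def hull_energy(hull, x):
    """Linearly interpolate the hull energy at composition x."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            return y1 + (y2 - y1) * (x - x1) / (x2 - x1)
    raise ValueError("composition lies outside the hull range")


def energy_above_hull(hull, x, energy):
    """Distance of a phase above the hull, in the same units as energy."""
    return energy - hull_energy(hull, x)


# Hypothetical formation energies (eV/atom) versus fraction of element B.
phases = {
    "A": (0.0, 0.0),
    "A3B": (0.25, -0.1),
    "AB": (0.5, -0.4),
    "AB3": (0.75, -0.3),
    "B": (1.0, 0.0),
}
hull = lower_hull(phases.values())
e_hull = {name: energy_above_hull(hull, x, e) for name, (x, e) in phases.items()}
```

In this made-up data set, A3B sits above the tie line between A and AB, so it is the only phase with a positive energy above the hull. A ternary system would need a hull in two composition dimensions, which this sketch does not attempt.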