Traditional machine learning (ML) metrics overestimate model performance for materials discovery. We introduce (1) leave-one-cluster-out cross-validation (LOCO CV) and (2) a simple nearest-neighbor benchmark to show that model performance in discovery applications strongly depends on the problem, data sampling, and extrapolation. Our results suggest that ML-guided iterative experimentation may outperform standard high-throughput screening for discovering breakthrough materials such as high-Tc superconductors.

Introduction

Materials informatics (MI), or the application of data-driven algorithms to materials problems, has grown quickly as a field in recent years.9 Across all of these applications, a training database of simulated or experimentally-measured materials properties serves as input to a ML algorithm that predictively maps features (i.e., materials descriptors) to target materials properties. Ideally, the result of training such models would be the experimental realization of new materials with promising properties. The MI community has produced several such success stories, including thermoelectric compounds,10,11 shape-memory alloys,12 superalloys,13 and 3D-printable high-strength aluminum alloys.14 However, in many cases, a model is itself the output of a study, and the question becomes: to what extent could the model be used to drive materials discovery?

Typically, the performance of ML models of materials properties is quantified via cross-validation (CV). CV can be performed either as a single division of the available data into a training set (to build the model) and a test set (to evaluate its performance), or as an ensemble process known as k-fold CV, wherein the data are partitioned into k non-overlapping subsets of nearly equal size (folds) and model performance is averaged across each combination of k-1 training folds and one test fold. Leave-one-out cross-validation (LOOCV) is the limit where k equals the total number of examples in the dataset. Table 1 summarizes some examples of model performance statistics as reported in the aforementioned studies (some studies involved testing multiple algorithms across multiple properties).

In Table 1, the reported model performance is uniformly excellent across all studies. A tempting conclusion is that any of these models could be used for one-shot high-throughput screening of large numbers of materials for desired properties. However, as we discuss below, traditional CV has critical shortcomings in terms of quantifying ML model performance for materials discovery.
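To make the contrast between these evaluation schemes concrete, the sketch below compares random k-fold CV with a LOCO CV split in a scikit-learn-style workflow. The synthetic data, the KMeans clustering, and the random-forest regressor are illustrative assumptions for this sketch, not the featurization, clustering method, or models used in the studies cited above.

# Sketch: random k-fold CV vs. a leave-one-cluster-out (LOCO) split.
# Assumptions: synthetic descriptors stand in for real materials features,
# KMeans defines the held-out clusters, and a random forest is the model.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))                        # stand-in materials descriptors
y = X[:, 0] ** 2 + rng.normal(scale=0.1, size=500)    # stand-in target property

model = RandomForestRegressor(n_estimators=100, random_state=0)

# Random k-fold CV: test points are drawn from the same distribution as the
# training points, so the model mostly interpolates. (LeaveOneOut() is the
# k = n limit of KFold.)
kfold_scores = cross_val_score(
    model, X, y, cv=KFold(n_splits=5, shuffle=True, random_state=0))

# LOCO CV: cluster the feature space and hold out one whole cluster at a time,
# so the model must extrapolate to an unseen region of materials space.
clusters = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)
loco_scores = cross_val_score(
    model, X, y, groups=clusters, cv=LeaveOneGroupOut())

print("random k-fold R^2:", kfold_scores.mean())
print("LOCO CV R^2:      ", loco_scores.mean())

A lower LOCO score relative to the random k-fold score would indicate exactly the extrapolation gap that, as argued below, traditional CV tends to hide.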
Issues with traditional cross-validation for materials discovery

Many ML benchmark problems consist of data classification into discrete bins, i.e., pattern matching. For example, the
Design, System, Application

Machine learning (ML) has become a widely-adopted predictive tool for materials design and discovery. Random k-fold cross-validation (CV), the traditional gold-standard approach for evaluating the quality of ML models, is fundamentally mismatched to the nature of materials discovery, and leads to ...