2022
DOI: 10.1038/s41598-022-23431-2

Uncertainty-aware mixed-variable machine learning for materials design

Abstract: Data-driven design shows the promise of accelerating materials discovery but is challenging due to the prohibitive cost of searching the vast design space of chemistry, structure, and synthesis methods. Bayesian optimization (BO) employs uncertainty-aware machine learning models to select promising designs to evaluate, hence reducing the cost. However, BO with mixed numerical and categorical variables, which is of particular interest in materials design, has not been well studied. In this work, we survey frequ…

Cited by 8 publications (8 citation statements)
References 45 publications
“…To explore whether greater acceleration can be achieved by employing a mixed-search-space BO strategy, we designed a set of simulated learning campaigns that combine our BO approach with one-hot encoding. One-hot encoding assigns each category its own binary indicator, so that no ordinal relationship exists between categories and the similarity between any two categories is assumed to be equal. 38,39 Given these considerations, and preserving the independence and equal similarity between the lattice design families, we selected one-hot encoding to reparametrize our feature space (Figure S5).…”
Section: Bayesian Optimization To Explore The Family Of Lattice Designs
confidence: 99%
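The one-hot reparametrization described in this citation statement can be sketched in a few lines. The family names and the numeric feature below are hypothetical placeholders for illustration, not the cited paper's actual design variables:

```python
import numpy as np

# Hypothetical lattice-family names; the actual design families are
# described in the cited work's Figure S5.
FAMILIES = ["octet", "kelvin", "gyroid"]

def one_hot(labels, categories=FAMILIES):
    """Encode each categorical label as a binary indicator vector."""
    index = {c: i for i, c in enumerate(categories)}
    out = np.zeros((len(labels), len(categories)))
    for row, lab in enumerate(labels):
        out[row, index[lab]] = 1.0
    return out

encoded = one_hot(["kelvin", "octet", "gyroid"])
# Any two distinct one-hot vectors sit at Euclidean distance sqrt(2),
# so no ordinal relationship is implied between categories.
numeric = np.array([[0.1], [0.4], [0.7]])   # e.g. a relative-density feature
features = np.hstack([encoded, numeric])    # mixed-variable input for BO
```

Concatenating the indicator columns with the numeric columns, as in the last line, yields the mixed-variable feature space that a BO surrogate can then model.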
“…In general, uncertainty-based acquisition can be conducted with either frequentist approaches, e.g., random forests and deep neural networks, or Bayesian approaches, e.g., GPs and generalized linear models. For practical guidance on which to employ, readers are referred to Zhang et al. [141] Meanwhile, diversity is frequently modeled as a pair-wise, model-agnostic metric that maps a pair of instances to a scalar similarity. [135] By harnessing the pair-wise kernel trick, diversity-based acquisition can handle high-dimensional input instances.…”
Section: Perspectives On Acquisition Strategy
confidence: 99%
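The pair-wise, model-agnostic similarity described here can be illustrated with an RBF kernel and a greedy batch selector. This is a minimal sketch under assumed names and a simple farthest-point rule, not the exact method of the cited references:

```python
import numpy as np

def rbf_similarity(a, b, length_scale=1.0):
    """Pair-wise, model-agnostic metric: maps two instances to a scalar
    similarity in (0, 1], independent of any trained model."""
    return float(np.exp(-np.sum((a - b) ** 2) / (2 * length_scale ** 2)))

def diverse_batch(pool, k):
    """Greedy selection: each new pick is the pool point least similar
    to the points already chosen (farthest-point heuristic)."""
    chosen = [0]                        # start from an arbitrary pool point
    while len(chosen) < k:
        # similarity of every candidate to its most similar chosen point
        sim = np.max(
            [[rbf_similarity(x, pool[c]) for c in chosen] for x in pool],
            axis=1,
        )
        chosen.append(int(np.argmin(sim)))  # least similar = most diverse
    return chosen

pool = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [2.0, 2.0]])
batch = diverse_batch(pool, 2)          # picks two mutually distant points
```

Because the metric only needs a pair of raw instances, the selection runs without any surrogate model, which is what makes the pair-wise kernel view attractive for high-dimensional inputs.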
“…Sequential Acquisition for Generic Use: Uncertainty vs Diversity: Sequential acquisition can be thought of as designing the rules with which to query an existing pool of unlabeled data, which, in our review, is typically a large number of unit cells with unknown properties. Here, we discuss and compare two key approaches to acquisition for DMD: uncertainty-based sampling [140,141] and diversity-based sampling. [65,85,86,132] Uncertainty-based sampling is centered on improving the prediction confidence of a model, typically resulting in a distributional imbalance that poorly represents the distribution of unlabeled data.…”
Section: Perspectives On Acquisition Strategy
confidence: 99%
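Uncertainty-based sampling of a pool can be sketched with a minimal zero-mean, unit-variance RBF Gaussian process in NumPy. This is an illustrative stand-in for the idea, not the implementation used in the cited references:

```python
import numpy as np

def rbf(A, B, ls=0.5):
    """RBF kernel matrix between two sets of 1-D inputs."""
    return np.exp(-(A[:, None] - B[None, :]) ** 2 / (2 * ls ** 2))

def posterior_std(X_train, X_pool, noise=1e-8):
    """Predictive standard deviation of a zero-mean, unit-variance RBF GP."""
    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    Ks = rbf(X_train, X_pool)
    alpha = np.linalg.solve(K, Ks)
    var = 1.0 - np.sum(Ks * alpha, axis=0)   # diag of posterior covariance
    return np.sqrt(np.clip(var, 0.0, None))

X_train = np.array([0.0, 1.0])               # labeled designs
X_pool = np.array([0.05, 0.5, 3.0])          # unlabeled pool
std = posterior_std(X_train, X_pool)
next_query = X_pool[np.argmax(std)]          # query where confidence is lowest
```

The acquisition rule is simply the argmax of the predictive standard deviation, which is why repeated queries cluster where the model is uncertain and can skew the sampled distribution, as the quoted passage notes.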
“…The basic procedure of the new method is shown in Figure 3. It is based upon the adaptive sampling method proposed in [8], but modified for use with FRPs. This new type of DOE, in this respect similar to the pearl-string method, adaptively selects the next design point based on the experiments already performed [3,5].…”
Section: Overview
confidence: 99%
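The adaptive sampling loop described above (fit a GP to the experiments performed so far, then run the next experiment where predictive uncertainty peaks) can be sketched as follows. The stand-in experiment function and all parameter values are illustrative assumptions, not the cited method's actual settings:

```python
import numpy as np

def rbf(A, B, ls=0.3):
    """RBF kernel matrix between two sets of 1-D inputs."""
    return np.exp(-(A[:, None] - B[None, :]) ** 2 / (2 * ls ** 2))

def adaptive_doe(experiment, candidates, n_init=3, n_add=5, noise=1e-6, seed=0):
    """Adaptive DOE loop: refit a zero-mean RBF GP after each experiment and
    evaluate the next design point where the predictive variance is largest."""
    rng = np.random.default_rng(seed)
    X = rng.choice(candidates, size=n_init, replace=False)  # initial design
    y = experiment(X)
    for _ in range(n_add):
        K = rbf(X, X) + noise * np.eye(len(X))
        Ks = rbf(X, candidates)
        alpha = np.linalg.solve(K, Ks)
        var = 1.0 - np.sum(Ks * alpha, axis=0)   # predictive variance
        x_next = candidates[np.argmax(var)]      # most uncertain design
        X = np.append(X, x_next)
        y = np.append(y, experiment(np.array([x_next]))[0])
    return X, y

# Cheap stand-in for a costly FRP experiment (illustrative only)
X, y = adaptive_doe(np.sin, np.linspace(0.0, 2 * np.pi, 50))
```

Each iteration drives the variance at the newly evaluated point to nearly zero, so successive queries spread across the design space rather than repeating earlier experiments.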
“…Figure 3: Usage of Gaussian process regression for increasing efficiency in design of experiments (based on [8]).…”
confidence: 99%