2022
DOI: 10.1021/acsami.1c24003

Simple Structural Descriptor Obtained from Symbolic Classification for Predicting the Oxygen Vacancy Defect Formation of Perovskites

Abstract: Symbolic classification is an approach of interpretable machine learning for building mathematical formulas that fit certain data sets. In this work, symbolic classification is used to establish the relationship between oxygen vacancy defect formation energy and structural features. We find a structural descriptor $n_a (r_a / E_{na} - r_b)$, where $n_a$ is the valence of the a-site ion, $r_a$ is the radius of the a-site ion, $E_{na}$ is the electronegativity of the a-site ion, and $r_b$ is the radius of the b-site ion. It …
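As a minimal sketch of how the descriptor quoted in the abstract could be evaluated, the snippet below computes n_a (r_a / E_na - r_b) for a single composition. The function name `structural_descriptor` and the input values are hypothetical placeholders, not data from the paper, and the abstract as shown here does not state the units of the radii or the electronegativity scale, so those are left to the caller.

```python
def structural_descriptor(n_a: float, r_a: float, e_na: float, r_b: float) -> float:
    """Evaluate n_a * (r_a / E_na - r_b).

    n_a  : valence of the a-site ion
    r_a  : radius of the a-site ion
    e_na : electronegativity of the a-site ion
    r_b  : radius of the b-site ion
    Units and scales are assumptions left to the caller.
    """
    if e_na == 0:
        raise ValueError("electronegativity must be non-zero")
    return n_a * (r_a / e_na - r_b)


# Placeholder inputs chosen only to make the arithmetic easy to follow:
# 2 * (1.0 / 0.5 - 0.5) = 2 * 1.5 = 3.0
example_value = structural_descriptor(n_a=2, r_a=1.0, e_na=0.5, r_b=0.5)
print(example_value)
```

How a descriptor value maps to "vacancy likely" or "vacancy unlikely" depends on a decision boundary that is not given in the truncated abstract, so none is assumed here.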

Cited by 11 publications (5 citation statements)
References 47 publications
“…In general, a high ratio of sample size to feature dimension would lead to better model performance. When the existing features do not contain enough valid information to cause low model performance, new features can be either constructed based on domain knowledge or generated by simple mathematical transformation of existing features through algorithms such as the Sure Independence Screening Sparsifying Operator (SISSO) and genetic algorithm (GA) to improve model performance [27,28]. The properties of materials are influenced by their composition, structure, experimental conditions, and environmental factors, but there may be weakly correlated, uncorrelated, or redundant features in the data.…”
Section: Workflow Of Materials Machine Learning (mentioning)
confidence: 99%
“…Some researchers had used a particular feature selection method as a tool to determine whether the initial feature subset was valid and then taken other measures to construct other, more useful features. Liu et al [28] collected 3430 samples to predict the formation of the oxygen vacancy defect in perovskites. The target variable is the oxygen vacancy formation energy, which is defined as a dichotomous problem of whether an oxygen vacancy defect is likely to form or not by using 0.5 eV as the cutoff, and the initial features are 16 structural parameters containing ionic radius, ionic chemical valence, electronegativity, lattice parameters, tolerance factor, and octahedral factor.…”
Section: Feature Selection For Inorganic Perovskites (mentioning)
confidence: 99%
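The dichotomous target described in the statement above can be sketched as a simple threshold on the formation energy. The 0.5 eV cutoff comes from the quoted text. The direction of the label (energies below the cutoff treated as "vacancy likely to form") and the handling of values exactly at the cutoff are assumptions made for illustration, as are the function name and the sample energies.

```python
CUTOFF_EV = 0.5  # cutoff quoted in the citation statement


def label_vacancy_formation(energies_ev, cutoff_ev=CUTOFF_EV):
    """Turn formation energies (eV) into binary class labels.

    Assumption: 1 = vacancy likely to form (energy below the cutoff),
    0 = unlikely (energy at or above the cutoff).
    """
    return [1 if energy < cutoff_ev else 0 for energy in energies_ev]


# Hypothetical energies, not values from the 3430-sample data set.
labels = label_vacancy_formation([0.1, 0.5, 2.3])
print(labels)  # [1, 0, 0]
```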
“…They tested several regression methods and identified the best descriptors to be the difference in Pauling electronegativity between the oxygen and its nearest neighbor cation and the fraction of valence electrons in the material belonging to oxygen. Leveraging instead the data set of around 300 ΔE_vf calculations of perovskite oxides by Emery et al. [20], Liu et al. [40] also aimed to identify simple features not requiring any DFT calculation, considering elemental properties and proposing a descriptor composed of cation valence, electronegativity, and atomic radii. Recently, Wexler et al. [24] introduced a description of ΔE_vf as a linear combination of DFT defect-free stability, band gap, reduction energy of the metal cations neighboring the vacancy, and bond strength between the oxygen and its neighboring metal cations.…”
Section: Introduction (mentioning)
confidence: 99%
“…It avoids imposing previous assumptions and infers the model from the data, whereas conventional regression approaches aim to optimize the parameters for a pre-specified model structure. Hoist mutation, subtree mutation and crossover mutation are common operation to achieve an appropriate genetic model [55]. Ouyang et al, proposed a SISSO-based symbolic regression approach for discovering descriptors for material properties, which is based on the compressed-sensing dimensionality reduction [56].…”
Section: Introduction (mentioning)
confidence: 99%
“…This approach has been widely applied for descriptor design in many applications (e.g. to design simple structural descriptor for oxygen vacancy defect formation [55]), and is demonstrated to be an interpretable machine learning method to accelerate the discovery of perovskite catalysts that can be experimentally verified [57].…”
Section: Introduction (mentioning)
confidence: 99%