2023
DOI: 10.1063/5.0138913

ET-AL: Entropy-targeted active learning for bias mitigation in materials data

Abstract: Growing materials data and data-driven informatics drastically promote the discovery and design of materials. While there are significant advancements in data-driven models, the quality of data resources is less studied despite its huge impact on model performance. In this work, we focus on data bias arising from uneven coverage of materials families in existing knowledge. Observing different diversities among crystal systems in common materials databases, we propose an information entropy-based metric for mea…

Citations: cited by 13 publications (7 citation statements)
References: 54 publications
“…For example, an investigation of possible high-entropy alloys found wide disparities in compound space group (Figure b). Even knowledge of what materials can be made is limited, as many observed materials are metastable, and computational thermochemistry data sets have biased distributions of formation energies for different structure types. Furthermore, nearly all experimental and computational data consider low-pressure systems, yet chemical bonding and periodic trends are radically different at high pressures relevant to materials under extreme conditions.…”
Section: The Challenge Of the Exceptional (mentioning)
confidence: 99%
“…31 Even knowledge of what materials can be made is limited, as many observed materials are metastable, 35 and computational thermochemistry data sets have biased distributions of formation energies for different structure types. 36 Furthermore, nearly all experimental and computational data consider low-pressure systems, yet chemical bonding and periodic trends are radically different at high pressures relevant to materials under extreme conditions. 37 There remains plenty of room to discover new materials and molecules, and we are far from the regime of pure interpolation.…”
Section: The Challenge Of the Exceptional Iia What Is Exceptional? (mentioning)
confidence: 99%
“…31 Even knowledge of what materials can be made is limited, as many observed materials are metastable, 35 and computational thermochemistry datasets have biased distributions of formation energies for different structure types. 36 Furthermore, nearly all experimental and computational data considers low-pressure systems, yet chemical bonding and periodic trends are radically different at high pressures relevant to materials under extreme conditions. 37 There remains plenty of room to discover new materials and molecules, and we are far from the regime of pure interpolation.…”
Section: Figure (mentioning)
confidence: 99%
“…The information entropy of the observed property distribution can be useful for identifying outcome imbalances, and active learning used to prioritize new samples to correct these imbalances, recently demonstrated in the context of formation energy/structure biases of intermetallic compounds. 36 To understand how properties are coupled to one another, it might even be useful to fill in equally rare contraindicated regions with undesirable tradeoffs ("anti-exceptional materials"). Learning general ways to reach arbitrary outputs, can serve as steppingstones to the desired solution.…”
Section: Try To Fill-in-the-blanks Of Input and Output Space (mentioning)
confidence: 99%
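
The idea in the last citation statement, using the information entropy of an observed property distribution to spot imbalance and then prioritizing new samples that reduce it, can be illustrated with a minimal Python sketch. This is not the ET-AL procedure from the cited paper. It is a simplified greedy stand-in under stated assumptions: the property is binned into a fixed histogram, and each candidate carries a property value that would in practice come from a surrogate model's prediction. The names `bin_index`, `shannon_entropy`, and `pick_next` are hypothetical.

```python
import bisect
import math
from collections import Counter


def bin_index(value, bin_edges):
    """Return the histogram bin that holds `value`; out-of-range values clamp to the end bins."""
    i = bisect.bisect_right(bin_edges, value) - 1
    return min(max(i, 0), len(bin_edges) - 2)


def shannon_entropy(values, bin_edges):
    """Shannon entropy (in nats) of the histogram of `values` over fixed bins."""
    counts = Counter(bin_index(v, bin_edges) for v in values)
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def pick_next(labeled, candidates, bin_edges):
    """Greedy step: index of the candidate whose addition gives the highest entropy.

    `candidates` holds property values assumed to be surrogate predictions
    (an assumption of this sketch, not a detail taken from the source).
    """
    return max(
        range(len(candidates)),
        key=lambda i: shannon_entropy(labeled + [candidates[i]], bin_edges),
    )


if __name__ == "__main__":
    edges = [0.0, 0.5, 1.0]                # two bins: [0, 0.5) and [0.5, 1.0]
    labeled = [0.1, 0.2, 0.15, 0.3, 0.9]   # four samples in the low bin, one in the high bin
    candidates = [0.25, 0.8]
    print(pick_next(labeled, candidates, edges))
```

With these toy values, the candidate that falls in the under-populated bin is preferred, because adding it moves the histogram closer to uniform. A real workflow would replace the toy values with model predictions and would typically account for prediction uncertainty as well.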