2023
DOI: 10.1093/nsr/nwad125

Data quantity governance for machine learning in materials science

Abstract: Data-driven machine learning is widely employed in the analysis of materials structure-activity relationships, performance optimization and materials design because of its superior ability to reveal latent data patterns and make accurate predictions. However, because materials data acquisition is laborious, machine learning models encounter a mismatch between the high dimension of the feature space and the small sample size (for traditional machine learning models) or a mismatch between model par…

Citations: cited by 62 publications (28 citation statements)
References: 91 publications
“…However, the application of ML/DL for inorganic device development optimization remains challenging due to the limited availability of experimental data, especially at the research level, and this issue has been discussed in the literature. 7,8 High-throughput experiments that generate data automatically have also shown promise in material search and development as an experimental approach. [9][10][11][12] Another approach is to integrate the domain knowledge into the training of the model function to provide reasonable prediction.…”
Section: Introduction
confidence: 99%
“…[9][10][11][12] Another approach is to integrate the domain knowledge into the training of the model function to provide reasonable prediction. 8 The ML/DL techniques have recently been applied for photocatalytic materials and devices that have the ability of splitting water into oxygen and hydrogen using sunlight energy. 13 The application of ML/DL for inorganic device development optimization remains challenging due to the limited availability of experimental data.…”
Section: Introduction
confidence: 99%
“…A configurational sampling relying on MD trajectories or energetic local minima from DFT calculations is highly sparse and may be insufficient to reflect the actual molecular motion. Recently, sample-oriented methods are frequently used to govern the data quantity. For example, the idea of active learning is adopted, which performs an iterative process to gradually enlarge the data set size until convergence is achieved. However, the data set may be highly sensitive to the initial sampling pool and remain biased and incomplete.…”
confidence: 99%
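
The iterative process described in the last citation statement (enlarge the data set step by step until convergence) can be illustrated with a minimal pool-based active learning sketch. Everything below is an assumption made for illustration and is not the method of the cited work: `oracle` is a hypothetical stand-in for an expensive labeling step such as a DFT calculation, the model is a small committee of polynomial least-squares fits on bootstrap resamples, and "convergence" is taken to mean that the largest committee disagreement over the unlabeled pool falls below `tol`.

```python
import numpy as np


def committee_std(x_train, y_train, x_query, rng, n_models=8, degree=3):
    """Disagreement (std) among polynomial fits on bootstrap resamples."""
    preds = []
    for _ in range(n_models):
        # Resample the labeled set with replacement.
        idx = rng.integers(0, len(x_train), size=len(x_train))
        # Least-squares fit of a polynomial via its Vandermonde matrix.
        design = np.vander(x_train[idx], degree + 1)
        coeffs, *_ = np.linalg.lstsq(design, y_train[idx], rcond=None)
        preds.append(np.vander(x_query, degree + 1) @ coeffs)
    return np.std(preds, axis=0)


def active_learning(oracle, x_pool, n_init=5, tol=0.05, max_iter=50, seed=0):
    """Grow the labeled set until committee disagreement drops below tol.

    Returns the indices (into x_pool) of the samples that were labeled.
    """
    rng = np.random.default_rng(seed)
    # Initial sampling pool: a random subset, controlled by the seed.
    labeled = [int(i) for i in rng.choice(len(x_pool), size=n_init, replace=False)]
    for _ in range(max_iter):
        unlabeled = np.setdiff1d(np.arange(len(x_pool)), labeled)
        if unlabeled.size == 0:
            break  # nothing left to label
        x_train = x_pool[labeled]
        y_train = oracle(x_train)  # hypothetical expensive labeling step
        std = committee_std(x_train, y_train, x_pool[unlabeled], rng)
        if std.max() < tol:
            break  # assumed convergence criterion
        # Label the pool point on which the committee disagrees most.
        labeled.append(int(unlabeled[np.argmax(std)]))
    return np.array(labeled)


if __name__ == "__main__":
    x_pool = np.linspace(-1.0, 1.0, 40)
    for seed in (0, 1):
        chosen = active_learning(np.sin, x_pool, seed=seed, max_iter=20)
        print(seed, len(chosen), np.sort(chosen))
```

Running the sketch with different `seed` values changes the initial sampling pool and can change which points end up labeled, which is one simple way to probe the sensitivity to the initial pool that the citation statement warns about.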