2023
DOI: 10.1039/d3dd00082f

Interpretable models for extrapolation in scientific machine learning

Eric S. Muckley,
James E. Saal,
Bryce Meredig
et al.

Abstract: Data-driven models are central to scientific discovery. In efforts to achieve state-of-the-art model accuracy, researchers are employing increasingly complex machine learning algorithms that often outperform simple regressions in interpolative settings...

citations
Cited by 25 publications
(25 citation statements)
references
References 53 publications
“…Once hypothetically relevant features correlated to the output are selected, relatively simple models can be constructed to make extrapolations. Even simple linear models can be quite effective for this purpose. The features themselves need not have an interpretable relationship to the property being studied (vide infra); they merely serve as a proxy for guiding the experiment selection.…”
Section: Recommendations Toward ML For Exceptional Materials (mentioning)
confidence: 99%
“…Model explainability in these early stages is unnecessary because the models will be based on limited data and thus prone to overfitting and oversimplification. Moreover, the most appropriate models for initial discovery, for both interpretability and extrapolation, may be the types of feature-selected linear models discussed above, obviating the need for more sophisticated black-box model interpretability methods. In fact, empirical studies have found XAI detrimental in uncertain environments, as humans are more likely to reject helpful recommendations because of overconfidence in their troubleshooting abilities.…”
Section: Recommendations Toward ML For Exceptional Materials (mentioning)
confidence: 99%
“…[24] In fact, simple linear models built with an appropriate combination of input features are often better at extrapolating to novel examples. [25] Recent articles in this Journal have described activities to introduce chemistry students to ML techniques, including the use of ML classifier models to distinguish functional groups in IR spectra, [26] modeling the response of metal nanoparticle colorimetric sensors using neural networks, [27] chemometric analysis of wines, [28] and unsupervised clustering of FTIR and mass-spectrometry data for whisky, tea, and fruit. [29] In addition to teaching practical skills, these activities also implicitly teach students to be aware of limitations and possible failures of ML, including issues with data quantity and quality (e.g., data set imbalances, domain shifts) and effects on prediction quality.…”
Section: Introduction (mentioning)
confidence: 99%
“…Even simple regularized linear regression models suffice to predict chemical properties as diverse as molecular atomization energies, molecular orbital energies, and interatomic potentials, or to analyze photocurrent spectroscopy experiments. In fact, simple linear models built with an appropriate combination of input features are often better at extrapolating to novel examples. Recent articles in this Journal have described activities to introduce chemistry students to ML techniques, including the use of ML classifier models to distinguish functional groups in IR spectra, modeling the response of metal nanoparticle colorimetric sensors using neural networks, chemometric analysis of wines, and unsupervised clustering of FTIR and mass-spectrometry data for whisky, tea, and fruit.…”
Section: Introduction (mentioning)
confidence: 99%