2021
DOI: 10.2477/jccj.2020-0021

Constructing Regression Models with High Prediction Accuracy and Interpretability Based on Decision Tree and Random Forests

Abstract: Models for predicting properties/activities of materials based on machine learning can lead to the discovery of new mechanisms underlying properties/activities of materials. However, methods for constructing models that exhibit both high prediction accuracy and interpretability remain a work in progress because the prediction accuracy and interpretability exhibit a trade-off relationship. In this study, we propose a new model-construction method that combines decision tree (DT) with random forests (RF); which …

Cited by 5 publications (7 citation statements)
References 20 publications
“…However, it may differ among high, medium and low y‐values. Shimizu and Kaneko proposed a hybrid model of a decision tree (DT) and RF, in which the importance of the RF is calculated for each leaf node of the DT model, with DT providing a global interpretation of the entire dataset and RF providing a local interpretation for each cluster 33 …”
Section: Introduction
mentioning
confidence: 99%
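The hybrid described in the statement above (a DT that partitions the data, then RF importances computed within each DT leaf) can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the implementation of Shimizu and Kaneko: the function name `fit_dt_rf_hybrid`, the hyperparameters, and the synthetic dataset are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor


def fit_dt_rf_hybrid(X, y, max_leaf_nodes=3, min_samples_leaf=20, random_state=0):
    """Hypothetical helper: fit a shallow DT, then one RF per DT leaf.

    Returns the fitted tree and a dict mapping leaf id -> RF feature importances.
    """
    # Global view: a small tree splits the whole dataset into a few clusters.
    tree = DecisionTreeRegressor(
        max_leaf_nodes=max_leaf_nodes,
        min_samples_leaf=min_samples_leaf,
        random_state=random_state,
    )
    tree.fit(X, y)

    # apply() returns, for each sample, the index of the leaf it falls into.
    leaf_ids = tree.apply(X)

    # Local view: fit an RF on the samples of each leaf and keep its importances.
    leaf_importances = {}
    for leaf in np.unique(leaf_ids):
        mask = leaf_ids == leaf
        rf = RandomForestRegressor(n_estimators=100, random_state=random_state)
        rf.fit(X[mask], y[mask])
        leaf_importances[int(leaf)] = rf.feature_importances_
    return tree, leaf_importances


# Synthetic data, used only to make the sketch runnable (an assumption).
X, y = make_regression(n_samples=300, n_features=5, n_informative=3,
                       noise=0.1, random_state=0)
tree, leaf_importances = fit_dt_rf_hybrid(X, y)
for leaf, importances in leaf_importances.items():
    print(f"leaf {leaf}: {np.round(importances, 3)}")
```

Comparing the importance vectors across leaves is what gives the local reading: a variable may rank high in one cluster and low in another, while the DT splits themselves describe how the clusters were formed.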
“…Shimizu and Kaneko proposed a hybrid model of a decision tree (DT) and RF, in which the importance of the RF is calculated for each leaf node of the DT model, with DT providing a global interpretation of the entire dataset and RF providing a local interpretation for each cluster. 33 In this study, the number of defects that occur during the mass production of precision electrical components by a Japanese manufacturer should be decreased. The process consists of 11 chemical and physical unit operations.…”
mentioning
confidence: 99%
“…Although this importance is calculated considering the entire value of y , the importance of each X can vary depending on whether the y value is high, middle, or low. Shimizu and Kaneko proposed the DT and RF hybrid model that successfully interpreted global and local relationships between y and X …”
Section: Introduction
mentioning
confidence: 99%
“…Shimizu and Kaneko proposed the DT and RF hybrid model that successfully interpreted global and local relationships between y and X. 11 In addition to RF, there are other contribution indexes such as the local interpretable model-agnostic explanations (LIME) 12 and Shapley additive explanations (SHAP) 13 that can be combined with any regression analysis methods. In LIME and SHAP, by obtaining an approximation of the shape of the model at a certain sample point, the slope of X with respect to y around that sample point is obtained.…”
Section: Introduction
mentioning
confidence: 99%
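The idea in the statement above, approximating the model's shape around one sample to read off the slope of each X with respect to y, can be illustrated with a simplified LIME-style local surrogate. This is a sketch under stated assumptions, not the `lime` or `shap` library: the helper `local_slopes`, the Gaussian perturbation scale, the kernel, and the toy `black_box` function are all hypothetical choices for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def local_slopes(predict, x0, scale=0.1, n_samples=2000, seed=0):
    """Hypothetical helper: fit a proximity-weighted linear model around x0.

    `predict` is any callable mapping an (n, d) array to n predictions.
    Returns the surrogate's coefficients, i.e. the local slope per feature.
    """
    rng = np.random.default_rng(seed)
    x0 = np.asarray(x0, dtype=float)

    # Perturb the sample of interest with small Gaussian noise.
    Z = x0 + rng.normal(0.0, scale, size=(n_samples, x0.size))

    # Weight perturbed points by closeness to x0 (Gaussian kernel).
    sq_dist = np.sum((Z - x0) ** 2, axis=1)
    weights = np.exp(-sq_dist / (2.0 * scale ** 2))

    # The weighted linear fit approximates the model locally.
    surrogate = LinearRegression().fit(Z, predict(Z), sample_weight=weights)
    return surrogate.coef_


def black_box(X):
    # Toy stand-in for a trained regression model (an assumption).
    return np.sin(X[:, 0]) + X[:, 1] ** 2


slopes = local_slopes(black_box, [0.0, 1.0])
print(np.round(slopes, 2))
```

For this toy function the analytic gradient at (0, 1) is (1, 2), so the fitted slopes should land close to those values; with a real model there is no such reference and the result depends on the chosen scale and kernel.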
“…The feature importance of x is calculated considering the entire value of y but it can be different when the y value is high, medium, or low. Shimizu and Kaneko (2021) proposed a decision tree (DT) and RF hybrid model where the importance of RF was calculated for each leaf node in the DT model, and thus, DT provided a global interpretation of the entire dataset, and RF provided local interpretations for each cluster. 27 PFI can be used universally with various classification and regression methods such as scikit-learn 28 and also, can be used conveniently.…”
mentioning
confidence: 99%
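As a rough illustration of the PFI (permutation feature importance) mentioned in the statement above, scikit-learn provides `sklearn.inspection.permutation_importance`, which works with any fitted estimator. The model choice, the synthetic data, and the number of repeats below are assumptions made for the example.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data, used only to make the sketch runnable (an assumption).
X, y = make_regression(n_samples=300, n_features=5, n_informative=2,
                       noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Any fitted regressor or classifier could be used here; RF is one option.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Shuffle each feature in turn on held-out data and record the score drop.
result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=0)

# Report features from most to least important.
for i in result.importances_mean.argsort()[::-1]:
    print(f"x{i}: {result.importances_mean[i]:.3f} "
          f"+/- {result.importances_std[i]:.3f}")
```

Note that this importance is computed over the whole evaluation set, so it is a global measure; it does not by itself distinguish high, medium, and low y regions.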