2023
DOI: 10.1371/journal.pcbi.1010963

Inferring feature importance with uncertainties with application to large genotype data

Abstract: Estimating feature importance, that is, the contribution of a feature to a prediction or to several predictions, is an essential aspect of explaining data-based models. Besides explaining the model itself, an equally relevant question is which features are important in the underlying data-generating process. We present a Shapley-value-based framework for inferring the importance of individual features, including uncertainty in the estimator. We build upon the recently published model-agnostic feature importan…
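
To make the idea of a Shapley-value estimate with an attached uncertainty concrete, here is a minimal Python sketch. It is not the framework described in the abstract: it uses a hypothetical toy value function (`toy_value`) in place of a trained model, and a plain permutation-sampling estimator whose standard error comes from the spread of sampled marginal contributions. All function names are illustrative assumptions.

```python
import itertools
import math
import random
import statistics


def toy_value(coalition):
    """Hypothetical value function (an assumption, not from the paper).

    Feature 0 contributes 1 on its own, features 0 and 1 together add 2,
    and feature 2 is irrelevant.
    """
    v = 0.0
    if 0 in coalition:
        v += 1.0
    if 0 in coalition and 1 in coalition:
        v += 2.0
    return v


def exact_shapley(value, n):
    """Exact Shapley values: average marginal contribution over all n! orderings."""
    phi = [0.0] * n
    orderings = list(itertools.permutations(range(n)))
    for order in orderings:
        coalition = set()
        prev = value(frozenset(coalition))
        for i in order:
            coalition.add(i)
            cur = value(frozenset(coalition))
            phi[i] += cur - prev
            prev = cur
    return [p / len(orderings) for p in phi]


def sampled_shapley(value, n, n_samples, seed=0):
    """Monte Carlo Shapley estimate with a standard error per feature.

    Each sampled ordering yields one marginal contribution per feature;
    the mean is the estimate and stdev / sqrt(n_samples) its standard error.
    """
    rng = random.Random(seed)
    contributions = [[] for _ in range(n)]
    for _ in range(n_samples):
        order = list(range(n))
        rng.shuffle(order)
        coalition = set()
        prev = value(frozenset(coalition))
        for i in order:
            coalition.add(i)
            cur = value(frozenset(coalition))
            contributions[i].append(cur - prev)
            prev = cur
    means = [statistics.fmean(c) for c in contributions]
    std_errs = [statistics.stdev(c) / math.sqrt(n_samples) for c in contributions]
    return means, std_errs


exact = exact_shapley(toy_value, 3)
means, std_errs = sampled_shapley(toy_value, 3, n_samples=2000)
```

In this toy game the interaction term is split evenly between features 0 and 1, and the irrelevant feature 2 receives an importance of zero with zero standard error. A feature whose estimate lies within a few standard errors of zero could not be distinguished from an unimportant one, which is the kind of question an uncertainty-aware importance estimate is meant to answer.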

citations
Cited by 4 publications
(1 citation statement)
references
References 34 publications
“…Finally, while we evaluated the overall contribution of genetic variation in both the XGBoost and GLMM models, we have not evaluated the statistical significance of individual amino acid changes. For the XGBoost-SHAP analysis, methods for evaluating statistical significance are an active area of research (50). In the case of GLMMs, we ran separate versions of the model for each of 28 amino acid changes and did not apply multiple-testing correction methods.…”
Section: GLMM Modeling Results (mentioning)
confidence: 99%