2016
DOI: 10.1186/s13755-016-0015-4

Automatically explaining machine learning prediction results: a demonstration on type 2 diabetes risk prediction

Abstract: Background: Predictive modeling is a key component of solutions to many healthcare problems. Among all predictive modeling approaches, machine learning methods often achieve the highest prediction accuracy, but suffer from a long-standing open problem precluding their widespread use in healthcare. Most machine learning models give no explanation for their prediction results, whereas interpretability is essential for a predictive model to be adopted in typical healthcare settings. Methods: This paper presents the fi…

Cited by 87 publications (84 citation statements)
References 25 publications

“…39 In addition, most machine learning models are complex and difficult to interpret because they depend heavily on aspects related to feature distribution, data availability and data representation. 40 In the present study we built and validated a simple and interpretable algorithm with excellent accuracy. Despite the high PPV and NPV in the stable, adequate glycaemic control trajectory, the PPV in the deteriorated glycaemic control trajectory was only 45.8% in the validation cohort.…”
Section: Discussion (mentioning)
confidence: 93%
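The 45.8% figure quoted above is a positive predictive value. As a refresher on the two quantities, the sketch below computes PPV and NPV from a 2x2 confusion matrix; the counts are illustrative placeholders chosen to land near 45.8%, not the cited cohort's actual numbers.

```python
# Minimal sketch: positive and negative predictive value from a 2x2 confusion matrix.
# The counts are illustrative placeholders, not taken from the cited validation cohort.

def ppv_npv(tp, fp, tn, fn):
    """Return (PPV, NPV) given true/false positive and negative counts."""
    ppv = tp / (tp + fp)  # fraction of predicted positives that are truly positive
    npv = tn / (tn + fn)  # fraction of predicted negatives that are truly negative
    return ppv, npv

# A PPV near 45.8% arises when the predicted positives contain almost as many
# false positives as true positives, even while NPV stays high.
print(ppv_npv(tp=55, fp=65, tn=800, fn=80))  # -> (0.458..., 0.909...)
```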
“…One of the reasons for this could be that data obtained from EHRs are considered a byproduct of healthcare delivery, rather than a resource to improve its performance. In addition, most machine learning models are complex and difficult to interpret because they depend heavily on aspects related to feature distribution, data availability and data representation. In the present study we built and validated a simple and interpretable algorithm with excellent accuracy.…”
Section: Discussion (mentioning)
confidence: 98%
“…The clinical and administrative dataset is deidentified and publicly available from the Practice Fusion Diabetes Classification Challenge [15, 34], containing 3-year (2009-2012) records as well as the labels of 9948 adult patients from all US states in the following year. A total of 1904 of these patients had a diagnosis of type 2 diabetes in the following year.…”
Section: Methods (mentioning)
confidence: 99%
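As a rough orientation to the task described in this excerpt, the sketch below derives the outcome prevalence (1904/9948, about 19.1%) and sets up a stratified split for a binary classifier. The file name and column names (practice_fusion_features.csv, patient_id, t2dm_next_year) are assumptions made for illustration, not the actual Practice Fusion schema.

```python
# Hypothetical framing of the prediction task: ~9948 adult patients with
# 3 years of records, 1904 of whom received a type 2 diabetes diagnosis
# in the following year. File and column names are assumed, not real.
import pandas as pd
from sklearn.model_selection import train_test_split

print(f"Outcome prevalence: {1904 / 9948:.1%}")  # ~19.1%

df = pd.read_csv("practice_fusion_features.csv")  # assumed pre-built feature table
X = df.drop(columns=["patient_id", "t2dm_next_year"])
y = df["t2dm_next_year"]

# A stratified split preserves the ~19% positive rate in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
```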
“…Historically, machine learning was blamed for being a black box. A recent method can automatically explain any machine learning model’s classification results with no accuracy loss [14, 15]. Yet, two hurdles remain in using machine learning in health care.…”
Section: Introduction (mentioning)
confidence: 99%
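The “recent method” referenced here is the one demonstrated in the indexed paper; the sketch below is not that method. It is only a generic illustration of the post-hoc, model-agnostic idea: the fitted model is left untouched (so its accuracy is unchanged), and each feature of one instance is scored by how much the predicted probability shifts when that feature is neutralised to its training-set mean.

```python
# Generic post-hoc explanation sketch (NOT the cited paper's algorithm):
# perturb one feature at a time and record the change in predicted probability.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=500, n_features=6, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)  # stand-in black box

def explain_instance(model, X_train, x):
    """Per-feature effect: drop in predicted probability when that feature
    is replaced by its training mean (a crude, illustrative attribution)."""
    base = model.predict_proba(x.reshape(1, -1))[0, 1]
    effects = []
    for j in range(x.shape[0]):
        x_mod = x.copy()
        x_mod[j] = X_train[:, j].mean()
        effects.append(base - model.predict_proba(x_mod.reshape(1, -1))[0, 1])
    return np.array(effects)

print(explain_instance(model, X, X[0]))  # positive values push toward class 1
```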
“…Only a small number of the methods that are listed in ►Table 1 have been applied to predicting clinical outcomes. For example, Luo applied their method to type-2 diabetes risk prediction, 18 Štrumbelj et al. developed and applied their method to breast cancer recurrence predictions, 19 and Reggia and Perricone developed explanations for predictions of the type of stroke. 11 More widespread application of these methods to clinical predictions can provide evidence of applicability and utility of these methods to clinical users.…”
Section: Background and Significance (mentioning)
confidence: 99%