2024
DOI: 10.1038/s41467-024-46663-4

Data-driven identification of predictive risk biomarkers for subgroups of osteoarthritis using interpretable machine learning

Rikke Linnemann Nielsen,
Thomas Monfeuga,
Robert R. Kitchen
et al.

Abstract: Osteoarthritis (OA) is increasing in prevalence and has a severe impact on patients’ lives. However, our understanding of biomarkers driving OA risk remains limited. We developed a model predicting the five-year risk of OA diagnosis, integrating retrospective clinical, lifestyle and biomarker data from the UK Biobank (19,120 patients with OA, ROC-AUC: 0.72, 95%CI (0.71–0.73)). Higher age, BMI and prescription of non-steroidal anti-inflammatory drugs contributed most to increased OA risk prediction ahead of dia…
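
The abstract reports discrimination as a ROC-AUC with a 95% confidence interval. As a rough illustration of what such a figure means, the sketch below computes ROC-AUC from the rank-sum formulation and attaches a percentile bootstrap interval. This is an assumption-laden example, not the authors' procedure: the abstract does not say how the interval was obtained, and the function names `roc_auc` and `bootstrap_ci` and all parameters are hypothetical.

```python
import random


def roc_auc(labels, scores):
    """ROC-AUC via the rank-sum (Mann-Whitney U) formulation.

    labels are 0/1; tied scores receive their average rank.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for ROC-AUC (assumed method, for illustration)."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class; AUC is undefined there.
        if 0 < sum(ys) < n:
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

Calling `roc_auc(y_true, y_score)` on held-out predictions gives the point estimate, and `bootstrap_ci(y_true, y_score)` gives a lower and upper bound. On a cohort as large as the one described, a narrow interval such as 0.71–0.73 is what one would expect from resampling.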

Cited by 12 publications (1 citation statement)
References 86 publications
“…We trained an XGBoost classifier (Chen and Guestrin, 2016) to predict whether the omics data originated from UC or CD patients (CD=703 and UC=320) (Supplementary Figure 3). We chose this model as it is considered to be state-of-the-art in generic tabular data as well as on similar large biomedical cohorts (Nielsen et al, 2024). Furthermore, while neural network architectures could have been used, we preferred XGBoost due its inherent interpretability, and its aforementioned consistent high performance in similar data.…”
Section: Building an ML Model to Classify UC vs CD
Citation type: mentioning (confidence: 99%)