2022
DOI: 10.2196/33440

The Application and Comparison of Machine Learning Models for the Prediction of Breast Cancer Prognosis: Retrospective Cohort Study

Abstract: Background: In recent years, machine learning methods have been increasingly explored in cancer prognosis because of the appearance of improved machine learning algorithms. Some of these algorithms, such as support vector machines for survival analysis and random survival forest (RSF), can use censored data for modeling. However, it is still debated whether traditional (Cox proportional hazard regression) or machine learning-based prognostic models have better predictive performance. …

Cited by 27 publications (24 citation statements)
References 32 publications
“…In contrast, machine learning and deep learning studies focus on clinical data and their applications to patient‐level prediction for breast cancer are limited. Studies by Ganggayah et al, 17 Xiao et al, 18 and Huang et al 19 using machine learning algorithms to predict the overall survival of breast cancer patients showed comparable performance to our research. Although RF was not the best among those algorithms, it performed well in all four studies.…”
Section: Discussion (supporting)
confidence: 83%
“…[37][38][39] As our study focused on the overall deaths of breast cancer patients, we took into consideration not only breast cancer-specific factors but also general health-related factors. Another important feature of our model was CCI score, a tool used for over [40][41][42][43] this variable was not considered in previous machine learning studies that had a similar aim to ours, [17][18][19]35 as these studies mainly focused on tumor characteristics. Hypertension, a comorbidity not included in the CCI, was another variable that contributed to the models' performance.…”
Section: Discussion (mentioning)
confidence: 99%
“…Feature importance is the average measure of how significant a feature is in comparison to other features used in the ensemble model to predict the outcome variable; higher feature importance means that that feature is used to differentiate one outcome versus the other more frequently [46]. This technique has been widely used in previous public health studies to identify predictors/factors of various health outcomes [46][47][48][49]. Furthermore, it has been demonstrated to be the best feature selection method when compared to other feature selection methods such as Boruta and recursive feature elimination techniques, which makes it extremely useful and efficient in selecting the important variables [50].…”
Section: Feature Engineering (mentioning)
confidence: 99%
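
The ensemble feature importance described in the statement above can be illustrated with a minimal sketch. It assumes a toy ensemble of bagged decision stumps and scores each feature by its share of the total Gini impurity decrease across the ensemble. The function names (`gini`, `best_stump`, `ensemble_importance`) and the data are hypothetical and chosen for illustration; this is not the method of the cited studies, which typically rely on library implementations.

```python
import random

def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)

def best_stump(rows, labels):
    """Return (feature_index, impurity_decrease) for the best single split."""
    n = len(rows)
    parent = gini(labels)
    best = (None, 0.0)
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            if parent - child > best[1]:
                best = (f, parent - child)
    return best

def ensemble_importance(rows, labels, n_stumps=50, seed=0):
    """Share of total impurity decrease per feature over bagged stumps."""
    rng = random.Random(seed)
    n = len(rows)
    totals = [0.0] * len(rows[0])
    for _ in range(n_stumps):
        # Bootstrap sample: draw n rows with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        f, gain = best_stump([rows[i] for i in idx], [labels[i] for i in idx])
        if f is not None:
            totals[f] += gain
    total = sum(totals)
    return [t / total for t in totals] if total else totals

# Hypothetical data: feature 0 determines the label, feature 1 is noise.
rows = [(i, (i * 7) % 5) for i in range(20)]
labels = [1 if i >= 10 else 0 for i in range(20)]
importance = ensemble_importance(rows, labels)
```

In this toy setting the informative feature receives most of the importance, which is the behavior the statement relies on when ranking predictors.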
“…Harrell's concordance index (C-index), which calculates the proportion of observation pairs for which the model predictions and observed survival times agree [19]. Several authors have used C-index to make comparison among different survival models [19][20][21]. Values range from 0.5 to 1, with a C-index of 0 indicating a model with no discrimination and a C-index of 1, a perfect model. As shown in Table 5, the entire model showed similar performance as measured by performance metrics, with nearly identical C indices.…”
Section: Comparison Between Models (mentioning)
confidence: 99%
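
The pairwise agreement that Harrell's C-index measures can be sketched as follows. This is a minimal illustration, assuming right-censored data, a risk score where higher means earlier expected death, and the simplification that pairs with tied survival times are skipped. The function name `harrell_c_index` and the sample data are hypothetical. Under this definition a constant (uninformative) risk score yields 0.5 and a perfectly ordered one yields 1.

```python
from itertools import combinations

def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs whose risk ordering matches survival ordering.

    times: observed follow-up times
    events: 1 if the event (death) was observed, 0 if censored
    risk_scores: model output, higher = higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let a be the subject with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event.
        # Tied times are skipped here for simplicity (an assumption).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical cohort: four patients, the third is censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
perfect = harrell_c_index(times, events, [4, 3, 2, 1])
uninformative = harrell_c_index(times, events, [1, 1, 1, 1])
reversed_order = harrell_c_index(times, events, [1, 2, 3, 4])
```

Censoring is what distinguishes this from a plain rank correlation: a pair contributes only when the earlier of the two times is an observed event, because otherwise the true ordering of the survival times is unknown.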