Optimal survival trees

Bertsimas, Dimitris; Dunn, Jack; Gibson, Emma; Orfanoudaki, Agni

doi:10.1007/s10994-021-06117-0

Cited by 25 publications

(24 citation statements)

References 79 publications


“…Decision trees (DTs) find a wide range of practical uses [190,189,108,179,26,29,28,69,25,27,30,32,33,115,72,34,37,31,142,145,52,24,174,171,36,67,127,35,19]. Moreover, DTs are the most visible example of a collection of machine learning (ML) models that have recently been advocated as essential for high-risk applications [160].…”

Section: Introduction (mentioning)

confidence: 99%

On Tackling Explanation Redundancy in Decision Trees

Izza, Ignatiev, Marques-Silva 2022

Preprint


Decision trees (DTs) epitomize the ideal of interpretability of machine learning (ML) models. The interpretability of decision trees motivates explainability approaches by so-called intrinsic interpretability, and it is at the core of recent proposals for applying interpretable ML models in high-risk applications. The belief in DT interpretability is justified by the fact that explanations for DT predictions are generally expected to be succinct. Indeed, in the case of DTs, explanations correspond to DT paths. Since decision trees are ideally shallow, and so paths contain far fewer features than the total number of features, explanations in DTs are expected to be succinct, and hence interpretable. This paper offers both theoretical and experimental arguments demonstrating that, as long as interpretability of decision trees equates with succinctness of explanations, then decision trees ought not be deemed interpretable. The paper introduces logically rigorous path explanations and path explanation redundancy, and proves that there exist functions for which decision trees must exhibit paths with explanation redundancy that is arbitrarily larger than the actual path explanation. The paper also proves that only a very restricted class of functions can be represented with DTs that exhibit no explanation redundancy. In addition, the paper includes experimental results substantiating that path explanation redundancy is observed ubiquitously in decision trees, including those obtained using different tree learning algorithms, but also in a wide range of publicly available decision trees. The paper also proposes polynomial-time algorithms for eliminating path explanation redundancy, which in practice require negligible time to compute. Thus, these algorithms serve to indirectly attain irreducible, and so succinct, explanations for decision trees. Furthermore, the paper includes novel results related to duality and enumeration of explanations, based on using SAT solvers as witness-producing NP-oracles.


“…Various machine learning methods have been designed to compensate for the limitations of the Cox progression models. Tree-based models are appealing owing to their logical and interpretable structures, as well as their ability to detect the complex interactions between covariates (5). Deep learning-based approaches are based on the automated learning of prognostic factors, without the need for prior assumptions on known factors (6).…”

Section: Introduction (mentioning)

confidence: 99%

Prediction of survival in oropharyngeal squamous cell carcinoma using machine learning algorithms: A study based on the surveillance, epidemiology, and end results database

et al. 2022


Background: We determined appropriate survival prediction machine learning models for patients with oropharyngeal squamous cell carcinoma (OPSCC) using the “Surveillance, Epidemiology, and End Results” (SEER) database.

Methods: In total, 4039 patients diagnosed with OPSCC between 2004 and 2016 were enrolled in this study. In particular, 13 variables were selected and analyzed: age, sex, tumor grade, tumor size, neck dissection, radiation therapy, cancer directed surgery, chemotherapy, T stage, N stage, M stage, clinical stage, and human papillomavirus (HPV) status. The T-, N-, and clinical staging were reconstructed based on the American Joint Committee on Cancer (AJCC) Staging Manual, 8th Edition. The patients were randomly assigned to a development or test dataset at a 7:3 ratio. The extremely randomized survival tree (EST), conditional survival forest (CSF), and DeepSurv models were used to predict the overall and disease-specific survival in patients with OPSCC. A 10-fold cross-validation on a development dataset was used to build the training and internal validation data for all models. We evaluated the predictive performance of each model using test datasets.

Results: A higher c-index value and lower integrated Brier score (IBS), root mean square error (RMSE), and mean absolute error (MAE) indicate a better performance from a machine learning model. The C-index was the highest for the DeepSurv model (0.77). The IBS was also the lowest in the DeepSurv model (0.08). However, the RMSE and RAE were the lowest for the CSF model.

Conclusions: We demonstrated various machine-learning-based survival prediction models. The CSF model showed a better performance in predicting the survival of patients with OPSCC in terms of the RMSE and RAE. In this context, machine learning models based on personalized survival predictions can be used to stratify various complex risk factors. This could help in designing personalized treatments and predicting prognoses for patients.
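The abstract above ranks models by c-index, where higher is better. As a rough illustration of what that number measures, here is a minimal pure-Python sketch of Harrell's concordance index. The function name, argument names, and toy data are assumptions made for illustration; this is not the code used in the cited study.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's c-index for right-censored data (illustrative sketch).

    times       -- observed follow-up times
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            # Pair (i, j) is comparable when i had the event before j's time.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event got the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # ties in risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the first subject is censored at time 1.
toy_times = [1, 2, 3]
toy_events = [0, 1, 1]
toy_scores = [1.0, 3.0, 2.0]
c_index = concordance_index(toy_times, toy_events, toy_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported 0.77 indicates that the model ordered most comparable patient pairs correctly.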


“…The main idea is to use previously optimized parameters in subsequent splitting criteria updates, ultimately outputting a single decision tree that can be visually examined. The OST loss function compares how close the predicted e^{Xβ} terms for each patient are to the cumulative survival probabilities, obtained by the Nelson-Aalen estimator [29]. We prioritize model robustness in the training process by: a) limiting the tree size, since too deep or too wide trees obfuscate the model interpretability, b) increasing the number of random restarts to use in the local search algorithm, and c) controlling the minimum number of points that must be present in every leaf node of the fitted trees.…”

Section: Optimal Survival Tree Model (mentioning)

confidence: 99%
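The statement above refers to the Nelson-Aalen estimator. For readers unfamiliar with it, the following is a minimal sketch of the standard estimator of the cumulative hazard, H(t) = sum over event times t_i <= t of d_i / n_i, where d_i is the number of events at t_i and n_i the number still at risk. The function name and toy data are assumptions for illustration; how the quantity enters the OST loss is not shown here.

```python
from collections import Counter


def nelson_aalen(times, events):
    """Nelson-Aalen cumulative hazard at each distinct event time (sketch).

    times  -- observed follow-up times
    events -- 1 if the event was observed, 0 if censored
    Returns a list of (event_time, cumulative_hazard) pairs.
    """
    # Number of observed events at each distinct time.
    deaths = Counter(t for t, e in zip(times, events) if e)
    cum_hazard = 0.0
    curve = []
    for t in sorted(deaths):
        # Subjects still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        cum_hazard += deaths[t] / at_risk
        curve.append((t, cum_hazard))
    return curve


# Toy data (hypothetical): five subjects, two of them censored.
toy_times = [1, 2, 2, 3, 4]
toy_events = [1, 1, 0, 1, 0]
curve = nelson_aalen(toy_times, toy_events)
```

On the toy data the increments are 1/5, 1/4, and 1/2 at times 1, 2, and 3; censored subjects contribute to the at-risk counts but add no increment of their own.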

“…In this retrospective study we explore a cohort of 842 EC patients with 43 clinicopathological and molecular features collected at the Helsinki University Hospital between 2007 and 2012. We report two interpretable models that predict disease-specific survival: a multivariable CPH regression and a visually interpretable optimal survival tree (OST) [29]. Both are built on two sets of variables: a clinical set and an extended set, which is enriched with molecular information of the EC patients, namely L1CAM (CD171) and estrogen receptor (ER) status indicators, as well as the cell cytology and tumor size.…”

Section: Introduction (mentioning)

confidence: 99%

“…While decision trees can be ensembled leading to better performance than single trees, like in the random survival forest algorithm by Ishwaran et al, this makes them considerably less interpretable [25][26][27]. In light of recent research advances aimed at improving decision tree algorithms through better splitting and pruning criteria, single decision tree models are a good alternative to the CPH regression in the development of explainable clinical prediction models [28,29].…”

Section: Introduction (mentioning)

confidence: 99%


Interpretable prognostic modeling of endometrial cancer

Zagidullin, Pasanen, Loukovaara, et al. 2022

Preprint


Endometrial carcinoma (EC) is one of the most common gynecological cancers in the world. In this work we apply Cox proportional hazards (CPH) and optimal survival tree (OST) algorithms to the retrospective prognostic modeling of disease-specific survival in 842 EC patients. We demonstrate that the linear CPH models are preferred for the EC risk assessment based on clinical features alone, while the interpretable, non-linear OST models are favored when patient profiles are enriched with tumor molecular data. By studying the OST decision path structure, we show how explainable tree models recapitulate existing clinical knowledge prioritizing L1 cell-adhesion molecule and estrogen receptor status indicators as key risk factors in the p53 abnormal EC subgroup. We believe that visually interpretable tree algorithms are a promising method to explore feature interactions and generate novel research hypotheses. To aid further clinical adoption of advanced machine learning techniques, we stress the importance of quantifying model discrimination and calibration performance in the development of explainable clinical prediction models.
