Background
We used a nonlinear machine-learning algorithm to re-analyze data from the Systolic Blood Pressure Intervention Trial (SPRINT) and identify features of systolic blood pressure (SBP) variability that portend poor cardiovascular outcomes.
Methods
We included all patients who completed the first year of the study without reaching a primary endpoint, namely myocardial infarction, other acute coronary syndromes, stroke, heart failure or death from a cardiovascular event (n = 8799; 94%). In addition to clinical variables, features representing longitudinal SBP trends and variability were derived and combined in a random forest classifier. The classifier was optimized using cross-validation on a training set comprising 70% of patients, and the area under the curve (AUC) was measured on the remaining 30%, held out as a testing set. Finally, the importance of each feature was determined from the decrease in node impurity attributable to that feature, averaged over all trees in the forest.
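The workflow described above (70%/30% split, cross-validated tuning, AUC on the testing set, impurity-based feature importance) can be sketched roughly as follows. This is an illustrative sketch only, not the study's code: it assumes scikit-learn, uses synthetic data in place of the SPRINT data set, and the hyperparameter grid, the stratified split, the number of folds and all variable names are assumptions introduced for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (assumption): NOT the SPRINT data set.
rng = np.random.default_rng(0)
n_patients, n_features = 1000, 6
X = rng.normal(size=(n_patients, n_features))
# The simulated outcome depends only on columns 0 and 1 (arbitrary choice).
logit = 1.5 * X[:, 0] - 1.0 * X[:, 1] - 2.0
y = (rng.random(n_patients) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

# 70% training / 30% testing split; stratification is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# Cross-validated tuning on the training set only.
# The grid and the 5 folds are hypothetical choices for illustration.
search = GridSearchCV(
    RandomForestClassifier(n_estimators=100, random_state=0),
    param_grid={"max_depth": [3, 6, None], "min_samples_leaf": [1, 5]},
    scoring="roc_auc",
    cv=5,
)
search.fit(X_train, y_train)
model = search.best_estimator_

# AUC on the held-out testing set.
test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

# Impurity-based importance: mean decrease in node impurity per feature,
# averaged over all trees and normalized to sum to 1.
importances = model.feature_importances_
ranking = np.argsort(importances)[::-1]  # feature indices, most important first

print(f"test AUC: {test_auc:.2f}")
print("features by importance:", ranking.tolist())
```

Keeping the testing set out of the tuning step means the reported AUC is not inflated by the hyperparameter search.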
Results
A total of 365 patients (4.1%) reached the combined primary outcome over 37 months of follow-up. The random forest classifier had an AUC of 0.71 on the testing set. The 10 most important features identified by the algorithm were, in descending order of importance, the urine albumin/creatinine (CR) ratio, estimated glomerular filtration rate, age, serum CR, history of subclinical cardiovascular disease (CVD), cholesterol, a variable representing SBP signals using wavelet transformation, high-density lipoprotein, the 90th percentile of SBP and triglyceride level.
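To make the two SBP-derived features named above more concrete, the following sketch computes a 90th percentile and one possible wavelet-based summary from a series of visit readings. It is a sketch under stated assumptions: the abstract does not specify the wavelet family, the decomposition level or how the coefficients were summarized, so the single-level Haar transform, the detail-energy summary, the function names and the example readings are all hypothetical.

```python
import numpy as np


def haar_dwt_level1(signal):
    """Single-level Haar wavelet transform (assumed wavelet, for illustration)."""
    x = np.asarray(signal, dtype=float)
    if x.size % 2:  # drop the last reading so that every reading has a pair
        x = x[:-1]
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)  # smooth trend between visit pairs
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)  # visit-to-visit fluctuation
    return approx, detail


def sbp_features(sbp_series):
    """Summarize one patient's SBP series; feature names are hypothetical."""
    sbp = np.asarray(sbp_series, dtype=float)
    _, detail = haar_dwt_level1(sbp)
    return {
        "sbp_p90": float(np.percentile(sbp, 90)),
        "sbp_haar_detail_energy": float(np.sum(detail ** 2)),
    }


# Made-up readings in mm Hg, not trial data.
visits = [138, 131, 127, 135, 122, 129, 124, 126]
features = sbp_features(visits)
print(features)
```

The detail energy grows when consecutive readings swing widely, so it captures short-term variability that a mean or percentile alone would miss.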
Conclusions
We demonstrated the use of a random forest algorithm to identify the most prognostic longitudinal representations of SBP. In addition to known risk factors for CVD, transformed variables derived from time-series SBP measurements were found to be important in predicting poor cardiovascular outcomes and require further evaluation.