Objective To develop and externally validate risk prediction equations to estimate absolute and conditional survival in patients with colorectal cancer.
Design Cohort study.
Setting General practices in England providing data for the QResearch database linked to the national cancer registry.
Participants The equations were derived from 44 145 patients aged 15-99 years with colorectal cancer from 947 practices. They were validated in 15 214 patients with colorectal cancer from 305 different QResearch practices and in 437 821 patients with colorectal cancer from the national cancer registry.
Main outcome measures The primary outcome was all cause mortality and the secondary outcome was colorectal cancer mortality.
Methods Cause specific hazards models were used to predict risks of colorectal cancer mortality and other cause mortality accounting for competing risks, and these risk estimates were combined to obtain risks of all cause mortality. Separate equations were derived for men and women. Several variables were tested: age, ethnicity, deprivation score, cancer stage, cancer grade, surgery, chemotherapy, radiotherapy, smoking status, alcohol consumption, body mass index, family history of bowel cancer, anaemia, liver function test result, comorbidities, use of statins, use of aspirin, clinical values for anaemia, and platelet count. Measures of calibration and discrimination were determined in both validation cohorts at 1, 5, and 10 years.
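As an illustration of how cause specific hazards can be combined into all cause and conditional survival, the sketch below assumes hazards that are constant within each yearly interval. The function names and the hazard values are hypothetical and are not taken from the study; the published equations use fitted models rather than this simplified arithmetic.

```python
import math

def combine_cause_specific(h_cancer, h_other, dt=1.0):
    """Combine two cause specific hazards (piecewise constant per interval).

    Returns three lists indexed by interval end (index 0 is time zero):
    all cause survival, cumulative incidence of cancer death, and
    cumulative incidence of death from other causes.
    """
    surv, cif_cancer, cif_other = [1.0], [0.0], [0.0]
    for hc, ho in zip(h_cancer, h_other):
        total = hc + ho
        s0 = surv[-1]
        s1 = s0 * math.exp(-total * dt)   # survive both causes over the interval
        died = s0 - s1                    # probability of dying in the interval
        share_cancer = hc / total if total > 0 else 0.0
        cif_cancer.append(cif_cancer[-1] + died * share_cancer)
        cif_other.append(cif_other[-1] + died * (1.0 - share_cancer))
        surv.append(s1)
    return surv, cif_cancer, cif_other

def conditional_survival(surv, s, t):
    """Probability of surviving to interval end t given survival to s (s <= t)."""
    return surv[t] / surv[s]

# Hypothetical yearly hazards for one patient (illustrative values only).
h_cancer = [0.20, 0.12, 0.08, 0.06, 0.05]
h_other = [0.03, 0.03, 0.03, 0.03, 0.03]

surv, cif_cancer, cif_other = combine_cause_specific(h_cancer, h_other)
print("5 year all cause survival:", round(surv[5], 3))
print("5 year survival given alive at 1 year:",
      round(conditional_survival(surv, 1, 5), 3))
```

At every time point the two cumulative incidences and the all cause survival sum to one, which is what allows the cause specific estimates to be combined into an all cause mortality risk.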
Results The final models included the following variables in men and women: age, deprivation score, cancer stage, cancer grade, smoking status, colorectal surgery, chemotherapy, family history of bowel cancer, raised platelet count, abnormal liver function, cardiovascular disease, diabetes, chronic renal disease, chronic obstructive pulmonary disease, prescribed aspirin at diagnosis, and prescribed statins at diagnosis. Improved survival in women was associated with younger age, earlier stage of cancer, well or moderately differentiated cancer grade, colorectal cancer surgery (adjusted hazard ratio 0.50), family history of bowel cancer (0.62), and prescriptions for statins (0.77) and aspirin (0.83) at diagnosis, with comparable results for men. The risk equations were well calibrated, with predicted risks closely matching observed risks. Discrimination was good in men and women in both validation cohorts. For example, the five year survival equations on the QResearch validation cohort explained 45.3% of the variation in time to colorectal cancer death for women, the D statistic was 1.86, and Harrell’s C statistic was 0.80 (both measures of discrimination, indicating that the scores are able to distinguish between people with different levels of risk). The corresponding results for all cause mortality were 42.6%, 1.77, and 0.79.
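The explained variation and the D statistic reported above are numerically linked if the explained variation is Royston and Sauerbrei's R²_D, which is an assumption here because the abstract does not name the measure. Under that assumption, a minimal sketch of the conversion is:

```python
import math

KAPPA_SQ = 8 / math.pi        # scaling constant used in the D statistic
SIGMA_SQ = math.pi ** 2 / 6   # variance term for proportional hazards models

def r2_from_d(d):
    """Royston and Sauerbrei's R^2_D computed from a D statistic."""
    scaled = d * d / KAPPA_SQ
    return scaled / (SIGMA_SQ + scaled)

# D statistics quoted in the abstract (women, QResearch validation cohort).
for label, d in [("colorectal cancer death", 1.86), ("all cause death", 1.77)]:
    print(label, round(100 * r2_from_d(d), 1), "%")
```

Because the published D values are rounded to two decimal places, the percentages recovered this way are close to, but need not exactly match, the reported 45.3% and 42.6%.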
Conclusions Risk prediction equations were developed and validated to estimate overall and conditional survival of patients with colorectal cancer accounting for an individual’s clinical and demographic characteristics. These equations can provide more individual...