“…A hierarchy of analytic methods exists for assessing calibration, ranging from weak (calibration intercept and slope) through moderate (a calibration curve showing the estimated risk of death on the x-axis and observed deaths on the y-axis) to strong (plotting calibration in patients with similar patterns of covariates) [4]. Of the 15 studies, only six used weak to moderate approaches to assess calibration [3] or reported the Hosmer-Lemeshow test p value, which has known limitations as a calibration test [4]. Nevertheless, the observed-to-expected (O:E) risk ratio is commonly reported, or it can be derived from the total numbers of observed and expected outcomes in a Hosmer-Lemeshow goodness-of-fit table published in a primary study.…”
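
The extraction of an O:E ratio from a Hosmer-Lemeshow table can be illustrated with a minimal sketch. It assumes the table reports, for each risk group, the observed and expected number of outcomes. The function name `oe_ratio` and the example counts are hypothetical and are not taken from any primary study.

```python
from typing import Iterable, Tuple


def oe_ratio(groups: Iterable[Tuple[float, float]]) -> float:
    """Return the overall observed-to-expected (O:E) ratio.

    `groups` holds one (observed, expected) pair per risk group,
    as would be read off a Hosmer-Lemeshow goodness-of-fit table.
    """
    total_observed = 0.0
    total_expected = 0.0
    for observed, expected in groups:
        total_observed += observed
        total_expected += expected
    if total_expected <= 0:
        raise ValueError("total expected outcomes must be positive")
    # O:E is the ratio of the column totals, not the mean of group-level ratios.
    return total_observed / total_expected


# Hypothetical table with four risk groups: (observed deaths, expected deaths).
example_table = [(3, 1.5), (4, 3.5), (6, 5.0), (11, 10.0)]
ratio = oe_ratio(example_table)
print(f"O:E = {ratio:.2f}")
```

Under this convention, a ratio above 1 indicates that the model predicted fewer outcomes than were observed overall, and a ratio below 1 indicates that it predicted more. The ratio summarises calibration across the whole sample only and does not show how calibration varies across the risk groups.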