Carl Moons and colleagues provide a checklist and background explanation for critical appraisal and data extraction for systematic reviews of prognostic and diagnostic prediction modelling studies.
Walter Bouwmeester and colleagues investigated the reporting and methods of prediction studies published in 2008 in six high-impact general medical journals, and found that the majority of prediction studies do not follow current methodological recommendations.
Background: Interest in prognostic reviews is increasing, but to properly review existing evidence an accurate search filter for finding prediction research is needed. The aim of this paper was to validate and update two previously introduced search filters for finding prediction research in Medline: the Ingui filter and the Haynes Broad filter.

Methodology/Principal Findings: Based on a hand search of 6 general journals in 2008, we constructed two sets of papers. Set 1 consisted of prediction research papers (n = 71), and set 2 consisted of the remaining papers (n = 1133). Both search filters were validated in two ways, using diagnostic accuracy measures as performance measures. First, we compared studies in set 1 (reference) with studies retrieved by the search strategies as applied in Medline. Second, we compared studies from 4 published systematic reviews (reference) with studies retrieved by the search filters as applied in Medline. Next, using word frequency methods, we constructed an additional search string for finding prediction research. Both search filters were good at identifying clinical prediction models: sensitivity ranged from 0.94 to 1.0 using our hand search as reference, and from 0.78 to 0.89 using the systematic reviews as reference. The latter performance even increased to around 0.95 (range 0.90 to 0.97) when either search filter was combined with the additional string that we developed. Retrieval of explorative prediction research was poor, whether using our hand search or the systematic reviews as reference, even when combined with our additional search string: sensitivity ranged from 0.44 to 0.85.

Conclusions/Significance: Explorative prediction research is difficult to find in Medline using any of the currently available search filters. Yet applying either the Ingui filter or the Haynes Broad filter results in a very low number of missed clinical prediction model studies.
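The validation step described above reduces to set arithmetic: sensitivity is the fraction of reference-set papers that the filter retrieves. Below is a minimal sketch in Python; the PMIDs and counts are invented for illustration, not taken from the study. It also shows why combining a filter with an additional search string (a set union) can only raise sensitivity:

```python
# Minimal sketch of validating a search filter against a hand-searched
# reference set; all PMIDs below are invented for illustration.

hand_search = {"101", "102", "103", "104", "105"}  # reference: prediction papers found by hand
filter_hits = {"102", "103", "104", "106"}         # papers the Medline filter retrieves
extra_hits = {"101", "106", "108"}                 # papers an additional search string retrieves

def sensitivity(retrieved: set, reference: set) -> float:
    """Share of reference papers that the retrieval finds (recall)."""
    return len(retrieved & reference) / len(reference)

print(sensitivity(filter_hits, hand_search))               # 0.6
# Combining retrievals is a set union, so sensitivity never decreases:
print(sensitivity(filter_hits | extra_hits, hand_search))  # 0.8
```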
Background: When study data are clustered, standard regression analysis is considered inappropriate, and analytical techniques for clustered data need to be used. For prediction research in which the interest is in predictor effects at the patient level, random effect regression models are probably preferred over standard regression analysis. It is well known that random effect parameter estimates differ from standard logistic regression parameter estimates. Here, we compared random effect and standard logistic regression models on their ability to provide accurate predictions.

Methods: Using an empirical study of 1642 surgical patients at risk of postoperative nausea and vomiting, each treated by one of 19 anesthesiologists (clusters), we developed prognostic models with either standard or random intercept logistic regression. External validity of these models was assessed in new patients from other anesthesiologists. We supported our results with simulation studies using intra-class correlation coefficients (ICC) of 5%, 15%, or 30%. Standard performance measures and measures adapted to the clustered data structure were estimated.

Results: The model developed with random effect analysis showed better discrimination than the standard approach if the cluster effects were used for risk prediction (standard c-index of 0.69 versus 0.66). In the external validation set, both models showed similar discrimination (standard c-index 0.68 versus 0.67). The simulation study confirmed these results. For datasets with a high ICC (≥15%), model calibration in external subjects was adequate only if the performance measure assumed the same data structure as the model development method: standard calibration measures showed good calibration for the standard model, while calibration measures accounting for the clustered data structure showed good calibration for the random intercept model.

Conclusion: The models with random intercept discriminate better than the standard model only if the cluster effect is used for predictions. The prediction model with random intercept had good calibration within clusters.
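As a rough illustration of the discrimination result, the sketch below simulates clustered binary outcomes at a chosen ICC (using the latent-scale residual variance π²/3 of the logistic model) and compares a pooled logistic regression with one that also carries cluster intercepts. This is not the authors' code: the data are simulated, the ICC and cluster sizes are invented, and fixed cluster dummies stand in for a true random intercept, since scikit-learn has no mixed-effects model.

```python
# Sketch: cluster intercepts improve discrimination at development, but not
# in new clusters where the cluster effect is unknown. Simulated data only;
# fixed cluster dummies approximate the random intercept model.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
N_CLUSTERS, N_PER_CLUSTER = 19, 100
ICC = 0.15                                 # intra-class correlation, latent scale
sigma2 = ICC * (np.pi**2 / 3) / (1 - ICC)  # from ICC = s2 / (s2 + pi^2/3)

def simulate():
    cluster = np.repeat(np.arange(N_CLUSTERS), N_PER_CLUSTER)
    b = rng.normal(0.0, np.sqrt(sigma2), N_CLUSTERS)  # cluster intercepts
    x = rng.normal(size=cluster.size)                 # one patient-level predictor
    y = rng.binomial(1, 1 / (1 + np.exp(-(x + b[cluster]))))
    return x, cluster, y

x, cluster, y = simulate()
X_std = x[:, None]
X_clu = np.column_stack([x, np.eye(N_CLUSTERS)[cluster]])  # add cluster dummies

# sklearn's default ridge penalty keeps the collinear intercept + dummies identifiable.
std = LogisticRegression(max_iter=1000).fit(X_std, y)
clu = LogisticRegression(max_iter=1000).fit(X_clu, y)

print("development: standard",
      round(roc_auc_score(y, std.predict_proba(X_std)[:, 1]), 3),
      "| with clusters",
      round(roc_auc_score(y, clu.predict_proba(X_clu)[:, 1]), 3))

# External validation in *new* clusters: their intercepts are unknown, so the
# cluster model can use only its fixed part (all dummy columns set to zero).
xv, _, yv = simulate()
Xv_std = xv[:, None]
Xv_clu = np.column_stack([xv, np.zeros((xv.size, N_CLUSTERS))])
print("validation:  standard",
      round(roc_auc_score(yv, std.predict_proba(Xv_std)[:, 1]), 3),
      "| with clusters",
      round(roc_auc_score(yv, clu.predict_proba(Xv_clu)[:, 1]), 3))
```

Run repeatedly with different seeds and ICC values: with a nonzero ICC the cluster model's development c-index exceeds the standard model's, while the two converge in validation on new clusters, mirroring the study's 0.69-versus-0.66 and 0.68-versus-0.67 pattern.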
We recommend at least 10 events per variable (EPV) to fit prediction models in clustered data using logistic regression. Up to 50 EPV may be needed when variable selection is performed.
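EPV is simple arithmetic: the number of outcome events divided by the number of candidate predictor parameters. A quick hypothetical check (both counts below are invented):

```python
# Hypothetical EPV check; the counts are invented for illustration.
events = 120               # patients experiencing the outcome
candidate_predictors = 10  # candidate predictor parameters considered

epv = events / candidate_predictors
print(epv)  # 12.0: meets the >=10 rule of thumb above, but well short of
            # the ~50 EPV suggested when variable selection is performed
```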