Background: An optimal risk-scoring system enables more targeted offers of colonoscopy in colorectal cancer (CRC) screening. This analysis aims to develop and validate scoring systems for average-risk populations using parametric and non-parametric methods.
Methods: Screening data on 807,695 subjects and 2,806 detected cases from the first-round CRC screening program in Shanghai were used to develop risk-prediction models and scoring systems using logistic-regression (LR) and artificial-neural-network (ANN) methods. The performance of the resulting scoring systems was evaluated using the area under the receiver operating characteristic curve (AUC), calibration, sensitivity, specificity, the number of high-risk individuals and the potential detection rate of CRC.
Results: Age, sex, CRC in first-degree relatives, chronic diarrhoea, mucus or bloody stool, history of any cancer and faecal-immunochemical-test (FIT) results were identified as predictors of the presence of CRC. The AUC of the LR-based system in the derivation set was 0.642 when using risk factors only, and increased to 0.774 when one-sample FIT results were incorporated and to 0.808 when two-sample FIT results were included; the corresponding AUCs for the ANN-based systems were 0.639, 0.763 and 0.805, respectively. Calibration was better for the LR-based systems than for the ANN-based ones. Compared with the currently used initial tests, parallel use of FIT with the LR-based systems resulted in improved specificity, lower demand for colonoscopy and higher detection rates of CRC, whereas parallel use of FIT with the ANN-based systems yielded higher sensitivity; incorporating FIT into the scoring systems further increased specificity, decreased colonoscopy demand and improved detection rates of CRC.
Conclusions: Our results indicate the potential of LR-based scoring systems incorporating one- or two-sample FIT results for CRC mass screening. External validation is warranted before implementation is scaled up in the Chinese population.
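For readers less familiar with the evaluation metrics named in the Methods, the following is a minimal illustrative sketch of how AUC, sensitivity and specificity can be computed for a risk score. It is not the authors' code: the function names, the toy scores, the labels and the cutoff are all hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def auc(scores, labels):
    """AUC as the probability that a randomly chosen case has a higher
    score than a randomly chosen non-case (ties count as 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def sensitivity_specificity(scores, labels, cutoff):
    """Classify score >= cutoff as high risk, then return
    (sensitivity, specificity, number flagged as high risk)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp), tp + fp


# Hypothetical toy data: integer risk scores and CRC status (1 = case).
toy_scores = [1, 2, 2, 3, 4, 5, 5, 6]
toy_labels = [0, 0, 0, 1, 0, 1, 0, 1]

toy_auc = auc(toy_scores, toy_labels)
toy_sens, toy_spec, toy_flagged = sensitivity_specificity(
    toy_scores, toy_labels, cutoff=4
)
print(toy_auc, toy_sens, toy_spec, toy_flagged)
```

In this sketch the number flagged as high risk stands in for colonoscopy demand: raising the cutoff generally lowers that number and raises specificity at the cost of sensitivity, which is the trade-off the Results describe.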