AIM: To evaluate optimal molecular markers for detecting colorectal cancer (CRC) in a blood-based assay.
METHODS: A case-control design matched by age and sex (111 CRC and 227 non-cancer samples) was applied. Total RNA isolated from the 338 blood samples was reverse-transcribed, and the relative transcript levels of candidate genes were analyzed. The training set consisted of 162 samples drawn at random from the 338. A logistic regression analysis was performed, and the odds ratio for each gene was determined between CRC and non-cancer samples. The remaining 176 samples formed the testing set, which was used to validate the logistic model and to verify its inferred performance (generality). Twelve public microarray datasets (GSE4107, GSE4183, GSE8671, GSE9348, GSE10961, GSE13067, GSE13294, GSE13471, GSE14333, GSE15960, GSE17538, and GSE18105), comprising 519 adenocarcinoma cases and 88 normal-mucosa controls, were pooled to verify the genes selected by the logistic models and to estimate their external generality.
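The split-and-fit workflow described above can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the sample identifiers, the synthetic single-gene data, the random seeds, and the helper names (`split_samples`, `fit_logistic`, `odds_ratios`) are all assumptions, and a plain gradient-ascent fit stands in for whatever statistical software was actually used.

```python
import math
import random


def split_samples(sample_ids, n_train, seed=0):
    """Randomly partition sample IDs into a training set and a testing set."""
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Fit logistic regression by batch gradient ascent on the log-likelihood.

    Returns [intercept, coef_1, ..., coef_k].
    """
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            p = 1.0 / (1.0 + math.exp(-z))
            err = yi - p  # gradient of the log-likelihood w.r.t. z
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


def odds_ratios(w):
    """Odds ratio per predictor: exp of each non-intercept coefficient."""
    return [math.exp(c) for c in w[1:]]


# Hypothetical identifiers standing in for the 338 blood samples.
sample_ids = [f"S{i:03d}" for i in range(338)]
train_ids, test_ids = split_samples(sample_ids, n_train=162)

# Synthetic single-gene data (NOT study data): higher expression raises CRC odds.
rng = random.Random(1)
X, y = [], []
for _ in range(200):
    x = rng.gauss(0.0, 1.0)
    p_case = 1.0 / (1.0 + math.exp(-x))
    X.append([x])
    y.append(1 if rng.random() < p_case else 0)

weights = fit_logistic(X, y)
ors = odds_ratios(weights)
print(len(train_ids), len(test_ids), ors)
```

In practice a statistics package would also report confidence intervals and P values for each odds ratio; the sketch only shows how the split sizes and the coefficient-to-odds-ratio step fit together.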
RESULTS: The logistic regression analysis selected five significant genes (P < 0.05; MDM2, DUSP6, CPEB4, MMD, and EIF2S3), with odds ratios of 2.978, 6.029, 3.776, 0.538 and 0.138, respectively. The five-gene model performed stably in discriminating CRC cases from controls in the training set, with accuracies ranging from 73.9% to 87.0%, a sensitivity of 95% and a specificity of 95%. In addition, the discrimination model performed well in the test set, providing 83.5% accuracy [...] = 0.853, AUC = 0.978, accuracy = 0.949, specificity = 0.818 and sensitivity = 0.971).
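As a reading aid for the odds ratios above, the sketch below converts each reported value to its log-odds coefficient (the natural logarithm of the odds ratio) and shows how accuracy, sensitivity and specificity follow from confusion-matrix counts. It assumes the usual coding in which each predictor is a relative transcript level and the outcome is CRC = 1. The model intercept is not reported here, so predicted probabilities cannot be reproduced, and the counts passed to `classification_metrics` are hypothetical, not study data.

```python
import math

# Odds ratios as reported for the five selected genes.
REPORTED_ODDS_RATIOS = {
    "MDM2": 2.978,
    "DUSP6": 6.029,
    "CPEB4": 3.776,
    "MMD": 0.538,
    "EIF2S3": 0.138,
}


def log_odds_coefficients(odds_ratios):
    """Convert odds ratios to logistic regression coefficients (beta = ln OR)."""
    return {gene: math.log(value) for gene, value in odds_ratios.items()}


def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),  # true-positive rate among cases
        "specificity": tn / (tn + fp),  # true-negative rate among controls
    }


coefs = log_odds_coefficients(REPORTED_ODDS_RATIOS)
for gene, beta in coefs.items():
    # A positive coefficient (OR > 1) means higher levels go with higher CRC odds.
    direction = "higher" if beta > 0 else "lower"
    print(f"{gene}: beta = {beta:.3f} ({direction} CRC odds per unit increase)")

# Hypothetical counts for illustration only (not taken from the study).
demo = classification_metrics(tp=8, fn=2, tn=9, fp=1)
print(demo)
```

Under that coding, MDM2, DUSP6 and CPEB4 (odds ratios above 1) carry positive coefficients, while MMD and EIF2S3 (odds ratios below 1) carry negative ones.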
RETROSPECTIVE STUDY