Background
Gastric stromal tumors (GSTs) are the most prevalent mesenchymal-derived gastrointestinal cancers and account for 60–70% of all gastrointestinal stromal tumors (GISTs). However, simple and effective methods for diagnosing and screening GST are still lacking worldwide. This study aimed to build a GST early warning system by combining machine learning algorithms with routine blood, biochemical and tumor marker indicators.
Methods
In total, 697 complete samples, each comprising 42 blood indicators, were collected from four hospitals in Gansu Province: 318 from pretreatment GST patients, 180 from patients with gastric polyps and 199 from healthy individuals. Three algorithms, gradient boosting machine (GBM), random forest (RF) and logistic regression (LR), were used to build GST prediction models for comparison. The performance and stability of the models were evaluated using two validation techniques: 5-fold cross-validation and external validation. The DeLong test was used to assess whether differences in AUC between the models' ROC curves were statistically significant, based on the variance and covariance of the AUC estimates.
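As an illustration of how a DeLong comparison of two correlated AUCs can be computed, the following is a minimal sketch using only the Python standard library. It is not the authors' implementation; the function names (`delong_test`, `_components`, `_cov`) and the toy inputs are hypothetical, and labels are assumed to be coded 1 for GST and 0 otherwise.

```python
import math

def _components(pos, neg):
    """Return DeLong structural components: V10 (per positive) and V01 (per negative)."""
    def psi(x, y):
        # 1 if the positive scores higher, 0.5 on ties, 0 otherwise
        return 1.0 if x > y else 0.5 if x == y else 0.0
    v10 = [sum(psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01

def _cov(a, b):
    """Unbiased sample covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)

def delong_test(labels, scores_a, scores_b):
    """Compare the AUCs of two models scored on the same samples.

    Returns (auc_a, auc_b, z, two_sided_p). z and p are NaN when the
    estimated variance of the AUC difference is not positive.
    """
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    m, n = len(pos_idx), len(neg_idx)

    v10_a, v01_a = _components([scores_a[i] for i in pos_idx],
                               [scores_a[i] for i in neg_idx])
    v10_b, v01_b = _components([scores_b[i] for i in pos_idx],
                               [scores_b[i] for i in neg_idx])

    auc_a, auc_b = sum(v10_a) / m, sum(v10_b) / m

    # Variance of each AUC and the covariance between the two AUCs
    var_a = _cov(v10_a, v10_a) / m + _cov(v01_a, v01_a) / n
    var_b = _cov(v10_b, v10_b) / m + _cov(v01_b, v01_b) / n
    cov_ab = _cov(v10_a, v10_b) / m + _cov(v01_a, v01_b) / n

    var_diff = var_a + var_b - 2.0 * cov_ab
    if var_diff <= 0.0:
        return auc_a, auc_b, float("nan"), float("nan")

    z = (auc_a - auc_b) / math.sqrt(var_diff)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return auc_a, auc_b, z, p

# Toy data for illustration only (not from the study)
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
model_b = [0.6, 0.3, 0.7, 0.5, 0.4, 0.2]
print(delong_test(labels, model_a, model_b))
```

Because both models are scored on the same samples, the covariance term matters: ignoring it would treat the two AUCs as independent and generally misstate the variance of their difference.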
Results
The AUC values of both the GBM and RF models were higher than that of the LR model, and these differences were statistically significant (P < 0.05). The GBM model was considered optimal because its ROC curve enclosed the largest area with the axes, indicating robust classification performance by the accepted criterion for model discrimination. Finally, a model integrating the 8 top-ranked blood indices distinguished GST from gastric polyps and healthy individuals with a sensitivity, specificity and area under the curve of 0.941, 0.807 and 0.951, respectively, in the cross-validation set.
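For readers unfamiliar with the three reported metrics, the sketch below shows how sensitivity, specificity and AUC can be derived from predicted probabilities. It is illustrative only: the function names, the toy data and the 0.5 decision threshold are assumptions, not values taken from the study.

```python
def sensitivity_specificity(y_true, y_score, threshold=0.5):
    """Sensitivity (true positive rate) and specificity (true negative rate).

    The 0.5 threshold is an assumption made for illustration.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, y_score):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data for illustration only: 1 = GST, 0 = gastric polyp or healthy
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

sens, spec = sensitivity_specificity(y_true, y_score)
print(sens, spec, auc(y_true, y_score))
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the three values are reported together.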
Conclusion
The GBM model demonstrated strong classification performance and was able to rapidly distinguish GST patients from patients with gastric polyps and healthy individuals. This identification system not only provides an innovative strategy for the diagnosis of GST but also enables the exploration of hidden associations between blood parameters and GST, supporting subsequent studies on the prevention, surveillance and management of the disease. The GST discrimination system is freely available online for use by doctors and high-risk groups at https://jzlyc.gsyy.cn/bear/mobile/index.html.