Background: Circulating tumor cells (CTCs), which serve as a "liquid biopsy" of cancer, are cells shed from the primary tumor that can seed a secondary tumor in a distant organ, leading to metastasis. Recent research suggests that CTCs with abnormal gene copy numbers in mononuclear cell-enriched peripheral blood samples, termed circulating genetically abnormal cells (CACs), could be used as a non-invasive decision tool to identify patients with benign pulmonary nodules. CACs are identified by counting fluorescence in situ hybridization (FISH) signals. However, because CACs are rare in the blood, identifying them with this technique is time-consuming, which is a drawback of the method.
Methods: This study proposes an efficient, automatic FISH-based CAC identification approach that combines the high accuracy of You Only Look Once (YOLO)-V4 with the lightweight design and speed of MobileNet-V3. The backbone of YOLO-V4 was replaced with MobileNet-V3 to improve detection efficiency and prevent overfitting, and the YOLO-V4 architecture was modified to use a new, larger-scale feature map to improve the detection of small targets.
Results: We trained and tested the proposed model with five-fold cross-validation on a dataset containing more than 7,000 cells. All images in the dataset were 2,448×2,048 pixels in size, and each image contained more than 70 cells. The detection accuracy of the proposed model for each of the four fluorescence signal colors was approximately 98%, and the mean average precision (mAP) values were close to 100%. The final output of the method was the cell type, i.e., normal cell, CAC, gaining cell, or deletion cell. The method achieved a CAC identification accuracy of 93.86% (similar to that of an expert pathologist) and was about 500 times faster than a pathologist.
Conclusions: The developed method could greatly increase review accuracy, improve reviewer efficiency, and reduce review turnaround time during CAC identification.
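The last step described in the Results, mapping per-cell signal counts in the four color channels to a cell type, might look roughly like the sketch below. The abstract does not give the decision thresholds, so everything specific here is an assumption made for illustration: a baseline of two signals per probe, a CAC defined as a cell with gains in at least two probes, gains taking precedence over deletions, and the hypothetical channel names and the function name `classify_cell`.

```python
from typing import Dict

# Assumption: a normal (diploid) cell shows two signals per probe.
NORMAL_COPIES = 2


def classify_cell(signal_counts: Dict[str, int], min_gain_probes: int = 2) -> str:
    """Map per-channel FISH signal counts to a cell type.

    signal_counts: number of detected signals per color channel.
    min_gain_probes: assumed number of probes with a gain needed to call a CAC.
    """
    # Count channels above and below the assumed normal copy number.
    gains = sum(1 for n in signal_counts.values() if n > NORMAL_COPIES)
    losses = sum(1 for n in signal_counts.values() if n < NORMAL_COPIES)

    # Assumed precedence: CAC, then gain, then deletion, then normal.
    if gains >= min_gain_probes:
        return "CAC"
    if gains > 0:
        return "gain"
    if losses > 0:
        return "deletion"
    return "normal"


# Hypothetical channel names for the four-color probe set.
example = {"red": 3, "green": 3, "aqua": 2, "gold": 2}
print(classify_cell(example))
```

In such a pipeline, the detector would supply the per-channel counts for each segmented cell, and a rule of this kind would produce the reported cell type.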