Automated Detection of Laryngeal Carcinoma in Laryngoscopic Images from a Multicenter Database using a Convolutional Neural Network

Yan, Peikai; Li, Shaohua; Zhou, Zhou; Liu, Qian; Wu, Jiahui; Ren, Qingyi; Chen, Qiuhuan; Chen, Zhipeng; Chen, Ze; Chen, Shaohua; Scholp, Austin; Jiang, Jack J.; Kang, Jing; Ge, Pingjiang

doi:10.22541/au.163285523.38983442/v1

Cited by 1 publication

(2 citation statements)

References 14 publications

“…In addition, Cho et al reported the following two related papers: in one study, they applied four CNN models (six-layer CNN, VGG16, Inception-V3 and Xception) to laryngoscopic vocal fold images to classify the image into abnormal and normal [ 15 ], and in the other study, they applied four CNN models (VGG16, Inception-V3, MobileNet-V2 and EfficientNet-B0) to classify laryngeal diseases (cysts, nodules, polyps, leukoplakia, papillomas, Reinke’s edema, granulomas, palsies and normal) [ 16 ]; You et al applied 13 CNN models (AlexNet, four VGG models, three ResNet models, three DenseNet models, Inception-V3, and the proposed) to classify laryngeal leukoplakia (inflammatory keratosis, mild/moderate/severe dysplasia, and squamous cell carcinoma) using white-light endoscopy images [ 17 ]; Eggert et al applied DenseNet models to classify hyperspectral images of laryngeal, hypopharyngeal, and oropharyngeal mucosa into abnormal and normal [ 18 ]. Moreover, Hu et al applied Mask R-CNN with ResNet-50 backbone to two types of laryngoscopic imaging (narrow-band imaging and white-light imaging) for automated real-time segmentation and classification of vocal cord leukoplakia to classify the lesions into surgical and non-surgical groups [ 19 ]; Yan et al applied the Faster R-CNN model to laryngoscopic images of vocal lesions to screen for laryngeal carcinoma [ 20 ]; Kim et al applied the Mask R-CNN model to laryngoscopic images for real-time segmentation of laryngeal mass around the vocal cord [ 21 ]; Cen et al applied three CNN models (Faster R-CNN, Yolo V3, and SSD) to detect laryngeal tumors in endoscopic images (vocal fold, tumor, surgical tools, and other laryngeal tissues) [ 22 ]; Azam et al applied up to nine Yolo models to laryngoscopic video for real-time detection of laryngeal squamous cell carcinoma in both white-light and narrow-band imaging [ 23 ]. 
Among these previous studies on vocal area disease detection, eight [ 11 – 18 ] used AI models for classification and, therefore, were not able to provide information about the tumor-suspicious positions in the image.…”

Section: Related Work (mentioning)

confidence: 99%

“…Among these previous studies on vocal area disease detection, eight [ 11 – 18 ] used AI models for classification and, therefore, were not able to provide information about the tumor-suspicious positions in the image. Similar to the current study, five other studies [ 19 – 23 ] used AI models for object detection that can provide tumor-suspicious positions around the vocal cords; however, they commonly used only single-group disease images, such as vocal cord leukoplakia [ 19 ], laryngeal carcinoma [ 20 , 23 ], laryngeal mass [ 21 ], and cancer [ 22 ].…”

Section: Related Work (mentioning)

confidence: 99%

Convolutional neural network-based vocal cord tumor classification technique for home-based self-prescreening purpose

Kim, Hwang, Lee et al. 2023

BioMed Eng OnLine

Background: In this study, we proposed a deep learning technique that can simultaneously detect suspicious positions of benign vocal cord tumors in laryngoscopic images and classify the types of tumors into cysts, granulomas, leukoplakia, nodules and polyps. This technique is useful for simplified home-based self-prescreening purposes to detect the generation of tumors around the vocal cord early in the benign stage.

Results: We implemented four convolutional neural network (CNN) models (two Mask R-CNNs, Yolo V4, and a single-shot detector) that were trained, validated and tested using 2183 laryngoscopic images. The experimental results demonstrated that among the four applied models, Yolo V4 showed the highest F1-score for all tumor types (0.7664, cyst; 0.9875, granuloma; 0.8214, leukoplakia; 0.8119, nodule; and 0.8271, polyp). The model with the lowest false-negative rate was different for each tumor type (Yolo V4 for cysts/granulomas and Mask R-CNN for leukoplakia/nodules/polyps). In addition, the embedded-operated Yolo V4 model showed an approximately equivalent F1-score (0.8529) to that of the computer-operated Yolo V4 model (0.8683).

Conclusions: Based on these results, we conclude that the proposed deep-learning-based home screening techniques have the potential to aid in the early detection of tumors around the vocal cord and can improve the long-term survival of patients with vocal cord tumors.
