Objectives This study aimed to apply deep learning (DL) with convolutional
neural networks (CNNs) to laryngoscopic imaging to assist in
real-time automated segmentation and classification of vocal cord
leukoplakia. Methods This single-center retrospective diagnostic
study included 216 patients who underwent laryngoscopy and pathological
examination from October 1, 2018, through October 1, 2019. Lesions were
classified into a nonsurgical group (NSG) and a surgical group (SG) according
to pathology. All selected images of vocal cord leukoplakia were
annotated independently by 2 expert endoscopists and divided into
training, validation, and test sets in a 6:2:2 ratio for model
development and evaluation. Results Among the 260 lesions identified in 216
patients, 2220 images from narrow band imaging (NBI) and 2144 images
from white light imaging (WLI) were selected. For segmentation, the
average intersection-over-union (IoU) value exceeded 70%. For
classification, the model identified SG lesions on laryngoscopic
images with a sensitivity of 0.93 and a specificity of 0.94 in
WLI, and a sensitivity of 0.99 and a specificity of 0.97 in NBI. Moreover,
the model achieved a mean average precision (mAP) of 0.81 in WLI and
0.92 in NBI at an IoU threshold > 0.5. Conclusions The study found
that a model developed by applying DL with CNNs to laryngoscopic imaging
achieved high sensitivity, specificity, and mAP for automated
segmentation and classification of vocal cord leukoplakia. This finding
shows promise for applying DL with CNNs to assist in the
accurate diagnosis of vocal cord leukoplakia from WLI and NBI.
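
The evaluation quantities named above (a 6:2:2 split, IoU, sensitivity, and specificity) can be illustrated with a minimal Python sketch. This is not the study's code: the function names split_622, mask_iou, and sensitivity_specificity are hypothetical, masks are assumed to be flat lists of 0/1 pixels, and SG is assumed to be the positive class (label 1). The abstract does not state whether the split was made per image, per lesion, or per patient, so the sketch simply splits a list of items.

```python
import random


def split_622(items, seed=0):
    """Shuffle items and split them 6:2:2 into train/validation/test lists.

    Assumption: the unit being split (image, lesion, or patient) is chosen
    by the caller; the abstract does not specify it.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 6 // 10  # integer arithmetic avoids float rounding
    n_val = len(items) * 2 // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test


def mask_iou(pred, truth):
    """Intersection-over-union of two binary masks given as flat 0/1 lists."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Two empty masks agree perfectly, so treat that case as IoU = 1.
    return inter / union if union else 1.0


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity and specificity with label 1 = SG (assumed positive class).

    Assumes both classes occur at least once in y_true.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    train, val, test = split_622(range(10))
    print(len(train), len(val), len(test))
    print(mask_iou([1, 1, 0, 0], [1, 0, 1, 0]))
    print(sensitivity_specificity([1, 1, 1, 0, 0], [1, 1, 0, 0, 1]))
```

In this sketch a predicted region would count as a correct detection for mAP only when its IoU with the annotated region is above 0.5, which is how the IoU threshold reported in the Results is usually interpreted.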