Oral cancer is a major global health issue, accounting for 177,384 deaths in 2018, and is most prevalent in low- and middle-income countries. Automating the identification of potentially malignant and malignant lesions in the oral cavity could enable low-cost, early diagnosis of the disease. Building a large library of well-annotated oral lesions is key. As part of the MeMoSA® (Mobile Mouth Screening Anywhere) project, images are currently being gathered from clinical experts across the world, who have been provided with an annotation tool to produce rich labels. This paper presents a novel strategy for combining bounding box annotations from multiple clinicians. In addition, deep neural networks were used to build automated systems that derive complex patterns to tackle this difficult task. Using the initial data gathered in this study, two deep learning based computer vision approaches were assessed for the automated detection and classification of oral lesions for the early detection of oral cancer: image classification with ResNet-101 and object detection with Faster R-CNN. Image classification achieved an F1 score of 87.07% for identifying images that contained lesions and 78.30% for identifying images that required referral. Object detection achieved an F1 score of 41.18% for detecting lesions that required referral. Further performance figures are reported with respect to the type of referral decision. These initial results demonstrate that deep learning has the potential to tackle this challenging task.

INDEX TERMS Composite annotation, deep learning, image classification, object detection, oral cancer, oral potentially malignant disorders.
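The abstract does not specify how the bounding box annotations from multiple clinicians are combined into composite labels. One simple approach, shown here purely as an illustrative sketch (the `iou` and `merge_annotations` functions and the IoU threshold are assumptions, not the paper's actual method), is to group boxes whose intersection-over-union exceeds a threshold and average each group into a single composite box:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def merge_annotations(boxes, iou_thresh=0.5):
    """Greedily group boxes that overlap above the IoU threshold and
    average each group's coordinates into one composite box."""
    groups = []
    for box in boxes:
        for g in groups:
            if any(iou(box, other) >= iou_thresh for other in g):
                g.append(box)
                break
        else:
            groups.append([box])
    return [tuple(sum(b[i] for b in g) / len(g) for i in range(4))
            for g in groups]

# Two clinicians mark roughly the same lesion; a third marks another region.
boxes = [(10, 10, 50, 50), (12, 8, 52, 48), (100, 100, 140, 140)]
print(merge_annotations(boxes))
# → [(11.0, 9.0, 51.0, 49.0), (100.0, 100.0, 140.0, 140.0)]
```

A greedy grouping like this is order-dependent; more careful schemes (e.g. clustering on the full pairwise IoU matrix, or weighting clinicians by expertise) would be natural refinements.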
Oral cancer is most prevalent in low- and middle-income countries, where it is associated with late diagnosis; a significant factor in this is limited access to specialist diagnosis. The use of artificial intelligence for decision making on oral cavity images has the potential to improve cancer management and survival rates. This study forms part of the MeMoSA® (Mobile Mouth Screening Anywhere) project. In this paper, we extend our previous deep learning work and focus on the binary image classification of 'referral' vs. 'non-referral'. Transfer learning was applied, with several common pre-trained deep convolutional neural network architectures compared for fine-tuning on a small oral image dataset. Improvements over our previous work were made, with an accuracy of 80.88%, a corresponding sensitivity of 85.71%, and a specificity of 76.42%.
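The reported accuracy, sensitivity, and specificity follow the standard binary confusion-matrix definitions, with 'referral' as the positive class. A minimal sketch of those definitions, using hypothetical counts chosen only for illustration (not the study's actual confusion matrix):

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, and specificity from a binary
    confusion matrix, with 'referral' as the positive class."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # referral cases correctly flagged
    specificity = tn / (tn + fp)   # non-referral cases correctly cleared
    return accuracy, sensitivity, specificity

# Hypothetical counts for illustration only.
acc, sens, spec = binary_metrics(tp=30, fp=8, tn=26, fn=5)
print(round(acc, 4), round(sens, 4), round(spec, 4))
# → 0.8116 0.8571 0.7647
```

Because the classes here are roughly balanced, accuracy sits between sensitivity and specificity; on imbalanced referral data the two class-conditional rates are the more informative pair.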