Oral cancer is a major global health issue accounting for 177,384 deaths in 2018 and it is most prevalent in low-and middle-income countries. Enabling automation in the identification of potentially malignant and malignant lesions in the oral cavity would potentially lead to low-cost and early diagnosis of the disease. Building a large library of well-annotated oral lesions is key. As part of the MeMoSA ® (Mobile Mouth Screening Anywhere) project, images are currently in the process of being gathered from clinical experts from across the world, who have been provided with an annotation tool to produce rich labels. A novel strategy to combine bounding box annotations from multiple clinicians is provided in this paper. Further to this, deep neural networks were used to build automated systems, in which complex patterns were derived for tackling this difficult task. Using the initial data gathered in this study, two deep learning based computer vision approaches were assessed for the automated detection and classification of oral lesions for the early detection of oral cancer, these were image classification with ResNet-101 and object detection with the Faster R-CNN. Image classification achieved an F1 score of 87.07% for identification of images that contained lesions and 78.30% for the identification of images that required referral. Object detection achieved an F1 score of 41.18% for the detection of lesions that required referral. Further performances are reported with respect to classifying according to the type of referral decision. Our initial results demonstrate deep learning has the potential to tackle this challenging task. INDEX TERMS Composite annotation, deep learning, image classification, object detection, oral cancer, oral potentially malignant disorders.