Background
Deep learning, part of the broader concepts of artificial intelligence (AI) and machine learning, has achieved remarkable success in vision tasks. While there is growing interest in using this technology for diagnostic support in skin-related neglected tropical diseases (skin NTDs), studies in this area are limited, and fewer still have focused on dark skin. In this study, we aimed to develop deep learning-based AI models using clinical images we collected for five skin NTDs, namely Buruli ulcer, leprosy, mycetoma, scabies, and yaws, to understand how diagnostic accuracy can or cannot be improved using different models and training patterns.

Methodology
This study used photographs collected prospectively in Côte d'Ivoire and Ghana through our ongoing studies, which use digital health tools for clinical data documentation and teledermatology. Our dataset included a total of 1,709 images from 506 patients. Two convolutional neural networks, ResNet-50 and VGG-16, were adopted to examine the performance of different deep learning architectures and to validate their feasibility in diagnosing the targeted skin NTDs.

Principal findings
The two models correctly predicted over 70% of the diagnoses, and performance improved consistently with more training samples. The ResNet-50 model performed better than the VGG-16 model. Models trained with PCR-confirmed cases of Buruli ulcer yielded a 1-3% increase in prediction accuracy over models whose training sets included unconfirmed cases.

Conclusions
Our approach was to have the deep learning model distinguish between multiple pathologies simultaneously, which is close to real-world practice. The more images used for training, the more accurate the diagnosis became. The percentage of correct diagnoses increased when training used PCR-positive cases of Buruli ulcer.
This suggests that training on images from more accurately diagnosed cases may also yield more accurate AI models. However, the increase was marginal, which may indicate that clinical diagnosis alone is reliable to an extent for Buruli ulcer. Diagnostic tests also have their flaws and are not always reliable. One hope for AI is that, as an additional tool, it will help to objectively resolve this gap between diagnostic tests and clinical diagnoses. While there are still challenges to overcome, AI has the potential to address unmet needs where access to medical care is limited, as it is for those affected by skin NTDs.