During clinical consultations and case training, doctors analyze numerous images and sounds. A high-pressure consultation environment can increase the probability that a doctor makes an incorrect inference regarding vocal cord (VC) disease. Therefore, this study applied deep learning to design an edge-based VC disease detection system (EVC-DD) for common VC conditions (nodules, polyps, and cancer) to assist doctors in conducting consultations and case studies and in verifying the consistency of their disease inferences. Through deep learning, the system extracted clinically confirmed information and recorded it in its disease inference model. The experimental data set comprised videos of nodules, polyps, and cancer and was used to evaluate the performance of the proposed model. In total, 1740 images were extracted from the videos of 13 cases confirmed by two doctors. The images were randomly divided, under five-fold cross-validation, into 1044 (60%) for training, 348 (20%) for validation, and 348 (20%) for testing. During training, the EVC-DD model achieved 100% accuracy in detecting the three conditions with the optimal experimental configuration. On the independent test data with the optimized configuration, the EVC-DD model achieved an averaged F1 score of 99.42%, an averaged recall of 99.42%, an averaged precision of 99.42%, an accuracy of 99.42%, a Matthews correlation coefficient of 98.91%, and an area under the curve of 0.9957. The EVC-DD model required only 400 s to complete its training using 1740 images. The results indicate that the inferences of the EVC-DD model were highly consistent with the results of the clinical examination by doctors and that its training was data- and time-efficient, thereby allowing the model to learn new cases quickly.
Thus, the EVC-DD model can assist doctors in consultations and case analyses by providing reliable disease inferences and real-time input regarding new case knowledge.
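
The evaluation metrics reported above (averaged F1 score, recall, precision, accuracy, and the Matthews correlation coefficient) can all be derived from a three-class confusion matrix. The following is a minimal sketch of that computation, not the study's own code. The function name `summarize`, the assumption that "averaged" means an unweighted (macro) average over classes, and the example matrix are all illustrative; the counts are hypothetical and are not the EVC-DD test results.

```python
from math import sqrt


def summarize(cm):
    """Compute macro-averaged metrics from a square confusion matrix.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j. Assumption: "averaged" means an unweighted
    (macro) average over classes.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    true_counts = [sum(cm[k]) for k in range(n)]                       # row sums
    pred_counts = [sum(cm[r][k] for r in range(n)) for k in range(n)]  # column sums

    recalls, precisions, f1s = [], [], []
    for k in range(n):
        recall = cm[k][k] / true_counts[k] if true_counts[k] else 0.0
        precision = cm[k][k] / pred_counts[k] if pred_counts[k] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        recalls.append(recall)
        precisions.append(precision)
        f1s.append(f1)

    # Multiclass Matthews correlation coefficient (Gorodkin's generalization).
    cov_xy = correct * total - sum(p * t for p, t in zip(pred_counts, true_counts))
    cov_xx = total ** 2 - sum(p * p for p in pred_counts)
    cov_yy = total ** 2 - sum(t * t for t in true_counts)
    mcc = cov_xy / sqrt(cov_xx * cov_yy) if cov_xx and cov_yy else 0.0

    return {
        "f1": sum(f1s) / n,
        "recall": sum(recalls) / n,
        "precision": sum(precisions) / n,
        "accuracy": correct / total,
        "mcc": mcc,
    }


# Hypothetical counts for (nodule, polyp, cancer); NOT the study's data.
example_cm = [
    [4, 1, 0],
    [0, 5, 0],
    [0, 0, 5],
]
metrics = summarize(example_cm)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The area under the curve is omitted from this sketch because it requires per-class prediction scores rather than a confusion matrix.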