Oral squamous cell carcinoma (OSCC) is one of the deadliest and most common types of cancer. Its incidence is increasing annually, so early diagnosis is essential for patients to receive appropriate treatment. Biopsy is one of the most important techniques for analyzing samples, but it takes a long time to produce results. Manual diagnosis is also subject to error and to differences of opinion among doctors, especially in the early stages. Automated techniques can therefore help doctors reach a diagnosis and help patients receive appropriate treatment. This study developed several hybrid models based on fused CNN features for diagnosing oral cancer on the OSCC-100x and OSCC-400x datasets. These models can analyze medical images with high precision and accuracy and can detect subtle patterns, abnormalities, or indicators of disease that are difficult to recognize with the naked eye. The systems have the potential to significantly reduce human error and provide more consistent and reliable results, improving diagnostic accuracy. They also have the potential to detect OSCC early: by detecting the disease at an early stage, clinicians can intervene in a timely manner, potentially preventing OSCC progression and improving the chances of successful treatment. Four strategies were evaluated. The first strategy was based on pretrained GoogLeNet, ResNet101, and VGG16 models, which did not achieve satisfactory results. The second strategy was based on GoogLeNet, ResNet101, and VGG16 models combined with the adaptive region growing (ARG) segmentation algorithm. The third strategy was a hybrid technique combining the GoogLeNet, ResNet101, and VGG16 models with ANN and XGBoost classifiers, based on the ARG segmentation algorithm. The fourth strategy diagnosed oral cancer with ANN and XGBoost classifiers based on features fused across the CNN models.
The ANN with the fused features of GoogLeNet-ResNet101-VGG16 yielded an AUC of 98.85%, accuracy of 99.3%, sensitivity of 98.2%, precision of 99.5%, and specificity of 98.35%.
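As an illustration only, the feature-fusion idea and the reported metric types can be sketched in a few lines of Python. The sketch assumes that fusion means simple concatenation of the per-model feature vectors, which the text above does not specify, and it uses the standard confusion-matrix definitions of accuracy, sensitivity, precision, and specificity. The function names `fuse_features` and `binary_metrics`, the toy feature vectors, and the confusion counts are all hypothetical and are not taken from the study.

```python
from typing import Dict, List, Sequence


def fuse_features(*feature_sets: Sequence[float]) -> List[float]:
    """Concatenate per-model feature vectors into one fused vector.

    Assumption: fusion is plain concatenation; the study may apply
    additional steps (e.g. dimensionality reduction) not shown here.
    """
    fused: List[float] = []
    for features in feature_sets:
        fused.extend(features)
    return fused


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard confusion-matrix metrics for a two-class problem."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "precision": tp / (tp + fp),     # positive predictive value
        "specificity": tn / (tn + fp),   # true negative rate
    }


# Hypothetical toy vectors standing in for the three CNN feature outputs.
googlenet_features = [0.1, 0.2]
resnet101_features = [0.3, 0.4, 0.5]
vgg16_features = [0.6]
fused = fuse_features(googlenet_features, resnet101_features, vgg16_features)

# Hypothetical confusion counts, not the study's results.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a full pipeline the fused vector would be passed to a classifier such as an ANN or XGBoost, and the confusion counts would come from that classifier's predictions on a held-out test set.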