The study aimed to achieve the following objectives: (1) to fuse thermal and visible tongue images using various fusion rules of the discrete wavelet transform (DWT) in order to classify diabetic and normal subjects; (2) to extract statistical features from the region of interest of the tongue image before and after fusion; (3) to distinguish healthy subjects from diabetic patients using the fused tongue images with machine learning and deep learning algorithms. The study participants comprised 80 normal subjects and 80 age- and sex-matched diabetic patients. Biochemical tests, namely fasting glucose, postprandial glucose, and HbA1c, were performed for all participants. Visible and thermal tongue images were acquired using a digital single-lens reflex camera and a thermal infrared camera, respectively. The visible and thermal tongue images were fused using the wavelet transform method. Gray-level co-occurrence matrix (GLCM) features were then extracted individually from the visible, thermal, and fused tongue images. Machine learning classifiers and deep learning networks such as VGG16 and ResNet50 were used to classify normal subjects and diabetes mellitus patients, and image quality metrics were computed to compare classifier performance before and after fusion. The support vector machine outperformed the other machine learning classifiers after fusion, with an accuracy of 88.12%, compared to before fusion (thermal, 84.37%; visible, 63.1%). VGG16 produced a classification accuracy of 94.37% after fusion, versus 90.62% and 85% before fusion on the individual thermal and visible tongue images, respectively. These results therefore indicate that fused tongue images might serve as a non-contact elementary tool for pre-screening type II diabetes mellitus.
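The DWT-based fusion step described above can be sketched as follows. This is a minimal single-level illustration, not the authors' exact pipeline: it assumes a Haar wavelet, an averaging rule for the approximation sub-band, and a maximum-absolute-value rule for the detail sub-bands (one common choice among the "various fusion rules" the study compares); the function names are hypothetical.

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar decomposition -> (cA, cH, cV, cD) sub-bands."""
    # Pairwise average/difference along columns (rows of pixels).
    lo = (img[:, 0::2] + img[:, 1::2]) / 2.0
    hi = (img[:, 0::2] - img[:, 1::2]) / 2.0
    # Then along rows, giving approximation and three detail sub-bands.
    cA = (lo[0::2, :] + lo[1::2, :]) / 2.0
    cV = (lo[0::2, :] - lo[1::2, :]) / 2.0
    cH = (hi[0::2, :] + hi[1::2, :]) / 2.0
    cD = (hi[0::2, :] - hi[1::2, :]) / 2.0
    return cA, cH, cV, cD

def haar_idwt2(cA, cH, cV, cD):
    """Exact inverse of haar_dwt2."""
    h, w = cA.shape
    lo = np.empty((2 * h, w)); hi = np.empty((2 * h, w))
    lo[0::2, :] = cA + cV; lo[1::2, :] = cA - cV
    hi[0::2, :] = cH + cD; hi[1::2, :] = cH - cD
    img = np.empty((2 * h, 2 * w))
    img[:, 0::2] = lo + hi; img[:, 1::2] = lo - hi
    return img

def fuse_dwt(img_a, img_b):
    """Fuse two registered, equal-size images in the wavelet domain."""
    A = haar_dwt2(img_a.astype(float))
    B = haar_dwt2(img_b.astype(float))
    cA = (A[0] + B[0]) / 2.0  # average low-frequency (approximation) content
    # Max-absolute rule: keep the stronger detail coefficient (sharper edges).
    details = [np.where(np.abs(a) >= np.abs(b), a, b)
               for a, b in zip(A[1:], B[1:])]
    return haar_idwt2(cA, *details)
```

In practice the thermal image would first be registered to the visible image and both resized to matching dimensions; GLCM features would then be computed on the output of `fuse_dwt` as on the individual modalities.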