The sounds of certain industrial products (machines) carry important information about these products. Product classification or malfunction detection can be performed using a product’s sound. For this purpose, the sound can be used in its raw form or mapped to either features or images. The latter makes it possible to exploit recent performance improvements in image processing. In this paper, the sounds of seven industrial products are mapped into mel-spectrograms. The similarities of these images within the same class (machine type) and between classes, representing the intraclass and interclass similarities, respectively, are investigated. Three often-used image similarity measures are applied: Euclidean distance (ED), the Pearson correlation coefficient (PCC), and the structural similarity index (SSIM). These measures are compared with one another to analyze their behavior in this particular use case. According to the obtained results, the mel-spectrograms of five classes are similar, while two classes have unique properties, manifested in an intraclass similarity that is considerably larger than their interclass similarity. The applied image similarity measures lead to similar general results and show the same main trends, but they differ in the mutual relationships of similarity among classes. The differences between the images are less pronounced when the SSIM is applied than when ED or the PCC is used.
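As a minimal illustration of the three measures named above, the following Python sketch computes ED, the PCC, and a simplified SSIM for two equally sized grayscale images given as nested lists. It is not the implementation used in this paper: the function names are hypothetical, the toy images are made up for illustration, and the SSIM is evaluated over the whole image as a single window with the commonly used constants K1 = 0.01 and K2 = 0.03, whereas the standard SSIM averages the index over local sliding windows.

```python
import math


def _flatten(img):
    """Flatten a 2-D image (list of rows) into a flat list of floats."""
    return [float(v) for row in img for v in row]


def _moments(x, y):
    """Return the means, population variances, and covariance of two lists."""
    if len(x) != len(y) or not x:
        raise ValueError("images must be non-empty and have the same size")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((p - mx) ** 2 for p in x) / n
    vy = sum((q - my) ** 2 for q in y) / n
    cxy = sum((p - mx) * (q - my) for p, q in zip(x, y)) / n
    return mx, my, vx, vy, cxy


def euclidean_distance(a, b):
    """ED between two images: 0 for identical images, larger when they differ."""
    x, y = _flatten(a), _flatten(b)
    if len(x) != len(y):
        raise ValueError("images must have the same size")
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


def pearson_correlation(a, b):
    """PCC between two images, in [-1, 1]; undefined for constant images."""
    _, _, vx, vy, cxy = _moments(_flatten(a), _flatten(b))
    return cxy / math.sqrt(vx * vy)


def global_ssim(a, b, data_range=1.0, k1=0.01, k2=0.03):
    """Simplified SSIM: the whole image is treated as one window (assumption)."""
    mx, my, vx, vy, cxy = _moments(_flatten(a), _flatten(b))
    c1, c2 = (k1 * data_range) ** 2, (k2 * data_range) ** 2
    numerator = (2 * mx * my + c1) * (2 * cxy + c2)
    denominator = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return numerator / denominator


# Hypothetical 2x2 "images" with pixel values in [0, 1].
img_a = [[0.0, 0.5], [1.0, 0.25]]
img_b = [[1.0, 0.5], [0.0, 0.75]]  # inverted copy of img_a

print("ED  :", euclidean_distance(img_a, img_b))
print("PCC :", pearson_correlation(img_a, img_b))
print("SSIM:", global_ssim(img_a, img_b))
```

Note that ED is a distance (0 for identical images), whereas the PCC and the SSIM are similarity scores that equal 1 for identical images, so the direction of the scale has to be taken into account when the measures are compared.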