There are three main approaches to using audio signals as input to a deep learning model: extracting hand-crafted features from the audio signals, mapping the audio signals into suitable images such as spectrogram-like ones, and using the raw audio signals directly. Among these approaches, spectrogram-like images represent a compromise between the bias introduced by the processing (as with hand-crafted features) and the computational demands (as with raw audio signals). When a spectrogram-like image is used as the input to a deep learning model, various image processing techniques become available, including techniques for assessing image similarity, image matching, and image recognition. The topic of this paper is the similarity of spectrogram-like images obtained from DC motor sounds. To that end, relevant measures of image similarity are first reviewed, and then one of them, the Pearson correlation coefficient, is applied to evaluate the similarity of spectrogram-like images within the same class and between two different classes.
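As an illustration of the similarity measure named above, the following is a minimal sketch of the Pearson correlation coefficient computed between two equally sized images treated as flattened pixel vectors. The function name `pearson_similarity`, the 64x64 image size, and the synthetic random arrays are assumptions made for this example only; they stand in for real spectrogram-like images and are not the data or implementation used in the paper.

```python
import numpy as np


def pearson_similarity(img_a, img_b):
    """Pearson correlation coefficient between two images of equal shape.

    Hypothetical helper for illustration: both images are flattened and
    compared pixel by pixel.
    """
    a = np.asarray(img_a, dtype=np.float64).ravel()
    b = np.asarray(img_b, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError("images must have the same number of pixels")

    # Center each image on its own mean intensity.
    a_c = a - a.mean()
    b_c = b - b.mean()

    # Normalize the covariance by the product of the standard deviations.
    denom = np.sqrt((a_c ** 2).sum() * (b_c ** 2).sum())
    if denom == 0.0:
        # A constant image has no variance, so the coefficient is undefined.
        return float("nan")
    return float((a_c * b_c).sum() / denom)


# Synthetic stand-ins for spectrogram-like images (assumed data, not real motor sounds).
rng = np.random.default_rng(0)
base = rng.random((64, 64))                                 # reference image
same_class = base + 0.05 * rng.standard_normal((64, 64))    # slightly perturbed copy
other_class = rng.random((64, 64))                          # unrelated image

r_same = pearson_similarity(base, same_class)
r_other = pearson_similarity(base, other_class)
print("within-class similarity:", r_same)
print("between-class similarity:", r_other)
```

In this sketch the coefficient lies in the range from -1 to 1, so a perturbed copy of an image should score close to 1, while an unrelated image should score close to 0.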