Cybernetic vision systems can be deployed in problem domains where the goal is to achieve results similar to those produced by humans. Fundamentally, these problems consist of comparing image content between sets of images. This article contrasts two theoretical frameworks for image comparison: the semantic similarity approach, used in the earth mover's distance (EMD) and the integrated region matching (IRM) similarity measure, and the near set approach, on which the tolerance nearness measure (tNM) is based. The contributions of this article are a comparison of the image similarity measures EMD, IRM, and tNM, and a signature-based approach to calculating the tolerance nearness measure.