In this paper, we examine the complexities involved in retrieving images from a database comprised of objects of very similar appearance. Such an operation requires a process that can discriminate among images at a very fine level, such as distinguishing among various species of fish. Furthermore, incidental environmental factors such as change in viewpoints and slight, nonessential shape deformation must be excluded from the similarity criteria. To this end, we propose a new method for content-based image retrieval and indexing, one that is well suited for discriminating among objects within the same class in a way that is insensitive to incidental environmental changes. The scheme comprises a global alignment and a local matching process. Affine transform is used to model the different viewpoints associated with positioning the camera, while multidimensional indexing techniques are used to make the global alignment scheme efficient.A local matching process based on dynamic programming allows the optimal matching of local structures using cost metrics that may ignore nonessential local shape deformation.Results show the method's ability to cancel out visual distortions caused by a changing viewpoint, and its tolerance to noise, occlusion, and slight deformations of the object.