In the modern era of the Internet, mobile, and digital information technology, image retrieval for object identification, such as retrieving a wine label from an image of a wine bottle, has become an important and urgent problem in artificial intelligence. Compared with general image retrieval, it is more challenging for two reasons: a huge number of object or brand images are very similar and difficult to discriminate, and the number of images per brand in a given dataset varies greatly, that is, the samples are strongly unbalanced across brands. In this paper, we propose a CNN-SURF Consecutive Filtering and Matching (CSCFM) framework for this kind of image retrieval, focusing specifically on wine label retrieval. In particular, a Convolutional Neural Network (CNN) is used to filter out the impossible main-brands (manufacturers), narrowing the range of retrieval, and Speeded Up Robust Features (SURF) matching is improved by adopting the RANdom SAmple Consensus (RANSAC) mechanism and a modified Term Frequency-Inverse Document Frequency (TF-IDF) distance for accurate retrieval of the sub-brand (the item attribute under the manufacturer). The experiments are conducted on a dataset containing approximately 548k images of wine labels with 17,328 main-brands and 260,579 sub-brands. The experimental results demonstrate that our proposed method solves the wine label retrieval problem effectively and efficiently. Moreover, our proposed method is further evaluated on two public benchmarks for object identification image retrieval, the Oxford Buildings Benchmark (Oxford5k) and the University of Kentucky of Indoor Things Benchmark (UKB), and achieves 88.3% mean average precision on Oxford5k and an N-S score of 3.92 on UKB.

INDEX TERMS Image retrieval, object identification, wine label retrieval, CNN, SURF descriptor, filter out the impossible main-brands.
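The consecutive filter-then-match flow summarized in the abstract can be illustrated with a minimal sketch. It is not the CSCFM implementation: the function name, the `top_k` cutoff, and the plain dictionaries standing in for the CNN confidences and for the SURF/RANSAC/TF-IDF matching score are all assumptions made for illustration, since the abstract does not specify how many main-brands are kept or how scores are combined.

```python
def consecutive_filter_and_match(main_brand_scores, sub_brand_index, match_score, top_k=2):
    """Hypothetical two-stage retrieval sketch.

    main_brand_scores: dict of main-brand -> confidence (stand-in for a CNN output)
    sub_brand_index:   dict of main-brand -> list of sub-brand ids
    match_score:       callable, sub-brand id -> similarity
                       (stand-in for local-feature matching)
    top_k:             assumed number of main-brands kept after filtering
    """
    # Stage 1: keep only the top_k most likely main-brands.
    kept = sorted(main_brand_scores, key=main_brand_scores.get, reverse=True)[:top_k]

    # Stage 2: score only the sub-brands under the surviving main-brands.
    candidates = [
        (sub, match_score(sub))
        for brand in kept
        for sub in sub_brand_index.get(brand, [])
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return kept, candidates


# Toy data (invented for illustration only).
main_scores = {"A": 0.7, "B": 0.2, "C": 0.1}
index = {"A": ["A-red", "A-white"], "B": ["B-rose"], "C": ["C-brut"]}
similarity = {"A-red": 0.4, "A-white": 0.9, "B-rose": 0.3, "C-brut": 0.99}

kept, ranked = consecutive_filter_and_match(main_scores, index, similarity.get, top_k=2)
print(kept)          # main-brands that survive the filter
print(ranked[0][0])  # best-matching sub-brand among the survivors
```

In this toy run, the sub-brand `C-brut` has the highest similarity but is never scored, because its main-brand is removed in the first stage. This shows how filtering narrows the range of retrieval before the more expensive matching step.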