In this paper we propose a highly effective and scalable framework for recognizing logos in images. At the core of our approach lays a method for encoding and indexing the relative spatial layout of local features detected in the logo images. Based on the analysis of the local features and the composition of basic spatial structures, such as edges and triangles, we can derive a quantized representation of the regions in the logos and minimize the false positive detections. Furthermore, we propose a cascaded index for scalable multi-class recognition of logos.For the evaluation of our system, we have constructed and released a logo recognition benchmark which consists of manually labeled logo images, complemented with nonlogo images, all posted on Flickr. The dataset consists of a training, validation, and test set with 32 logo-classes. We thoroughly evaluate our system with this benchmark and show that our approach effectively recognizes different logo classes with high precision.
It is current state of knowledge that our neocortex consists of six layers [10]. We take this knowledge from neuroscience as an inspiration to extend the standard single-layer probabilistic Latent Semantic Analysis (pLSA) [13] to multiple layers. As multiple layers should naturally handle multiple modalities and a hierarchy of abstractions, we denote this new approach multilayer multimodal probabilistic Latent Semantic Analysis (mm-pLSA). We derive the training and inference rules for the smallest possible non-degenerated mm-pLSA model: a model with two leaf-pLSAs (here from two different data modalities: image tags and visual image features) and a single top-level pLSA node merging the two leaf-pLSAs. From this derivation it is obvious how to extend the learning and inference rules to more modalities and more layers. We also propose a fast and strictly stepwise forward procedure to initialize bottom-up the mm-pLSA model, which in turn can then be post-optimized by the general mm-pLSA learning algorithm. We evaluate the proposed approach experimentally in a query-by-example retrieval task using 50-dimensional topic vectors as image models. We compare various variants of our mm-pLSA system to systems relying solely on visual features or tag features and analyze possible pitfalls of the mm-pLSA training. It is shown that the best variant of the the proposed mm-pLSA system outperforms the unimodal systems by approximately 19% in our query-by-example task.
We present a scalable logo recognition technique based on feature bundling. Individual local features are aggregated with features from their spatial neighborhood into bundles. These bundles carry more information about the image content than single visual words. The recognition of logos in novel images is then performed by querying a database of reference images. We further propose a novel WGC-constrained ransac and a technique that boosts recall for object retrieval by synthesizing images from original query or reference images. We demonstrate the benefits of these techniques for both small object retrieval and logo recognition. Our logo recognition system clearly outperforms the current state-of-the-art with a recall of 83% at a precision of 99%.
We present a feature bundling technique based on min-Hashing. Individual local features are aggregated with features from their spatial neighborhood into bundles. These bundles carry more visual information than single visual words. The recognition of logos in novel images is then performed by querying a database of reference images. We further present a WGC-constrained ransac and a technique that boosts recall for object retrieval by synthesizing images from the original query image or reference images. We demonstrate the benefits of these techniques for both small object retrieval and logo recognition. Our logo recognition system clearly outperforms the current state-of-the-art with a recall of 83 % at a precision of 99 %.
The enormous growth of image databases calls for new techniques for fast and effective image search that scales with millions of images. Most importantly, the setting requires a compact but also descriptive image signature. Recently, the vector of aggregated local descriptors (VLAD) [1] has received much attention in large-scale image retrieval. In this paper we present two modifications for VLAD which improve the retrieval performance of the signature.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.