Many novel multimedia systems and applications use visual sensor arrays. An important issue in designing sensor arrays is placing the visual sensors so that they achieve a predefined goal. In this paper we focus on placement that maximizes coverage or achieves coverage at a certain resolution. We identify and consider four different problems: (a) maximizing coverage subject to a given number of cameras, (b) maximizing coverage subject to a maximum total price of the sensor array, (c) optimizing camera poses given fixed locations, and (d) minimizing the cost of a sensor array given a minimum required percentage of coverage. To solve these problems, we propose different algorithms. Our approaches can be subdivided into algorithms that give a globally optimal solution and heuristics that solve the problem within reasonable time and memory consumption at the cost of not necessarily finding the global optimum. We also present a user interface to enter and edit the spaces under analysis, the optimization problems, and the other setup parameters. The different algorithms are experimentally evaluated and results are presented. The results show that the algorithms work well and are suited for different practical applications. For the final paper it is planned to have the user interface running as a web service.
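Problem (a) in the abstract above, maximizing coverage with a fixed number of cameras, can be illustrated with a greedy heuristic of the kind the abstract contrasts with exact methods. The abstract does not specify its heuristics, so the following is only a minimal sketch under assumptions: the space is discretized into numbered grid cells, each candidate camera pose is described by the set of cells it sees, and the names `greedy_max_coverage`, `pose_a` and so on are hypothetical.

```python
from typing import Dict, List, Set


def greedy_max_coverage(candidates: Dict[str, Set[int]], num_cameras: int) -> List[str]:
    """Pick up to num_cameras poses, each time taking the pose that adds
    the most not-yet-covered grid cells. This is a heuristic: it does not
    guarantee the globally optimal selection."""
    covered: Set[int] = set()
    chosen: List[str] = []
    for _ in range(num_cameras):
        best = max(
            (name for name in candidates if name not in chosen),
            key=lambda name: len(candidates[name] - covered),
            default=None,
        )
        # Stop early when no remaining pose adds any new cell.
        if best is None or not (candidates[best] - covered):
            break
        chosen.append(best)
        covered |= candidates[best]
    return chosen


# Hypothetical toy instance: four candidate poses over grid cells 0..6.
candidates = {
    "pose_a": {0, 1, 2, 3},
    "pose_b": {3, 4, 5},
    "pose_c": {0, 1},
    "pose_d": {5, 6},
}
chosen = greedy_max_coverage(candidates, 2)
covered = set().union(*(candidates[name] for name in chosen))
print(chosen, len(covered))
```

With two cameras the sketch selects `pose_a` and then `pose_b`, covering six of the seven cells. A budget on total price, as in problem (b), could be handled in the same loop by ranking poses by newly covered cells per unit cost, which is again an assumption rather than the paper's stated method.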
Current knowledge holds that the neocortex consists of six layers [10]. We take this finding from neuroscience as inspiration to extend the standard single-layer probabilistic Latent Semantic Analysis (pLSA) [13] to multiple layers. As multiple layers should naturally handle multiple modalities and a hierarchy of abstractions, we call this new approach multilayer multimodal probabilistic Latent Semantic Analysis (mm-pLSA). We derive the training and inference rules for the smallest possible non-degenerate mm-pLSA model: a model with two leaf-pLSAs (here from two different data modalities: image tags and visual image features) and a single top-level pLSA node merging the two leaf-pLSAs. From this derivation it is obvious how to extend the learning and inference rules to more modalities and more layers. We also propose a fast and strictly stepwise forward procedure to initialize the mm-pLSA model bottom-up; the initialized model can then be post-optimized by the general mm-pLSA learning algorithm. We evaluate the proposed approach experimentally in a query-by-example retrieval task using 50-dimensional topic vectors as image models. We compare several variants of our mm-pLSA system to systems relying solely on visual features or tag features and analyze possible pitfalls of the mm-pLSA training. We show that the best variant of the proposed mm-pLSA system outperforms the unimodal systems by approximately 19% in our query-by-example task.
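As background for the abstract above, the standard single-layer pLSA that serves as a leaf can be fit with expectation-maximization. The sketch below covers only that single-layer building block, not the mm-pLSA training rules derived in the paper; the function name `plsa`, the toy count matrix, and the random initialization are assumptions made for illustration.

```python
import random
from typing import List, Tuple


def normalize(values: List[float]) -> List[float]:
    """Scale values to sum to 1, falling back to uniform if all are zero."""
    total = sum(values)
    if total <= 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def plsa(counts: List[List[int]], num_topics: int, iterations: int = 50,
         seed: int = 0) -> Tuple[List[List[float]], List[List[float]]]:
    """Fit P(w|z) and P(z|d) to a document-by-word count matrix with EM."""
    rng = random.Random(seed)
    num_docs, num_words = len(counts), len(counts[0])
    p_w_z = [normalize([rng.random() for _ in range(num_words)])
             for _ in range(num_topics)]
    p_z_d = [normalize([rng.random() for _ in range(num_topics)])
             for _ in range(num_docs)]
    for _ in range(iterations):
        new_w_z = [[0.0] * num_words for _ in range(num_topics)]
        new_z_d = [[0.0] * num_topics for _ in range(num_docs)]
        for d in range(num_docs):
            for w in range(num_words):
                if counts[d][w] == 0:
                    continue
                # E-step: posterior P(z|d,w) proportional to P(w|z) * P(z|d).
                post = normalize([p_w_z[z][w] * p_z_d[d][z]
                                  for z in range(num_topics)])
                # Accumulate expected counts for the M-step.
                for z in range(num_topics):
                    weight = counts[d][w] * post[z]
                    new_w_z[z][w] += weight
                    new_z_d[d][z] += weight
        # M-step: renormalize the expected counts into distributions.
        p_w_z = [normalize(row) for row in new_w_z]
        p_z_d = [normalize(row) for row in new_z_d]
    return p_w_z, p_z_d


# Hypothetical toy data: 4 documents (images) over a 4-word vocabulary.
counts = [[4, 3, 0, 0], [3, 5, 0, 1], [0, 0, 4, 4], [0, 1, 5, 3]]
p_w_z, p_z_d = plsa(counts, num_topics=2)
print(p_z_d)
```

Each row of `p_z_d` is the topic mixture of one document. In the setting of the abstract, such a vector (with 50 topics rather than 2) plays the role of the image model used for retrieval.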
Online image repositories such as Flickr contain hundreds of millions of images and are growing quickly. With this growth, the need to support indexing, searching, and browsing is becoming more and more pressing. In this work we use the image content as a source of information to retrieve images. We study the representation of images by Latent Dirichlet Allocation (LDA) models for content-based image retrieval. Image representations are learned in an unsupervised fashion, and each image is modeled as a mixture of the topics/object parts depicted in the image. This allows us to map images into subspaces for higher-level reasoning, which in turn can be used to find similar images. Different similarity measures based on the described image representation are studied. The presented approach is evaluated on a real-world image database consisting of more than 246,000 images and compared to image models based on probabilistic Latent Semantic Analysis (pLSA). Results show the suitability of the approach for large-scale databases. Finally, we incorporate active learning with user relevance feedback into our framework, which further boosts the retrieval performance.
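Once each image is represented by a topic mixture, retrieval reduces to comparing probability vectors. The abstract above does not name the similarity measures it studies, so the two below (cosine similarity and Jensen-Shannon divergence) are assumptions chosen as common options for comparing distributions; the function names and toy vectors are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(p: Sequence[float], q: Sequence[float]) -> float:
    """Cosine of the angle between two topic vectors (higher is more similar)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm > 0 else 0.0


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence: symmetric and bounded by ln 2
    (lower is more similar)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def rank_by_js(query: Sequence[float], database: List[Sequence[float]]) -> List[int]:
    """Return database indices ordered from most to least similar to the query."""
    return sorted(range(len(database)),
                  key=lambda i: js_divergence(query, database[i]))


# Hypothetical topic mixtures over three topics.
query = [0.7, 0.2, 0.1]
database = [[0.1, 0.1, 0.8], [0.6, 0.3, 0.1], [0.7, 0.2, 0.1]]
ranking = rank_by_js(query, database)
print(ranking)
```

In this toy query-by-example run the identical mixture ranks first and the mixture dominated by a different topic ranks last. Because the mixture `m` is positive wherever either input is, the divergence stays finite even when a topic has zero weight in one image.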
Many novel multimedia applications use visual sensor arrays. In this paper we address the problem of optimally placing multiple visual sensors in a given space. Our linear programming approach determines the minimum number of cameras needed to cover the space completely at a given sampling frequency. Simultaneously it determines the optimal positions and poses of the visual sensors. We also show how to account for visual sensors with different properties and costs if more than one kind is available, and report performance results.
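The minimum-camera problem in the abstract above is commonly written as a 0-1 covering program: one binary variable per candidate pose, minimize their sum, subject to every sample point being seen by at least one selected pose. The abstract does not give the paper's exact formulation, so this is an assumption. The sketch below solves a tiny instance of that covering problem by exhaustive search, which stands in for a linear programming solver and is not the paper's method; the names `min_camera_cover` and `pose_a` through `pose_d` are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Optional, Set


def min_camera_cover(candidates: Dict[str, Set[int]],
                     cells: Set[int]) -> Optional[List[str]]:
    """Return a smallest set of poses whose views jointly cover all cells,
    or None if even all poses together cannot cover them.
    Exhaustive search: exponential in the number of candidates, so only
    suitable for toy instances."""
    names = sorted(candidates)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            covered = set().union(*(candidates[name] for name in subset))
            if cells <= covered:
                return list(subset)
    return None


# Hypothetical toy instance: six sample points, four candidate poses.
cells = {0, 1, 2, 3, 4, 5}
candidates = {
    "pose_a": {0, 1, 2},
    "pose_b": {2, 3},
    "pose_c": {3, 4, 5},
    "pose_d": {0, 5},
}
solution = min_camera_cover(candidates, cells)
print(solution)
```

Here no single pose covers all six points, and `pose_a` together with `pose_c` does, so two cameras are the minimum. Sensors of different kinds and costs, as mentioned in the abstract, would change the objective from a count to a weighted sum of the selected poses.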