The success of many computer vision tasks lies in the ability to exploit the interdependency between different image modalities such as intensity and depth. Fusing the corresponding information can be achieved at several levels, and one promising approach is integration at a low level. Moreover, sparse signal models have been used successfully in many vision applications. Within this area of research, the so-called co-sparse analysis model has attracted considerably less attention than its well-known counterpart, the sparse synthesis model, although it has proven very useful in various image processing applications. In this paper, we propose a co-sparse analysis model that is able to capture the interdependency of two image modalities. It is based on the assumption that a pair of analysis operators exists such that the co-supports of the corresponding bimodal image structures are correlated. We propose an algorithm that learns such a coupled pair of operators from registered and noise-free training data. Furthermore, we explain how this model can be applied to solve linear inverse problems in image processing and how it can be used for image registration tasks. This paper extends the work of some of the authors by two major contributions. Firstly, a modification of the learning process is proposed that a priori guarantees unit norm and zero mean of the rows of the operator. This accounts for the intuition that contrast in image modalities carries the most information. Secondly, the model is used in a novel bimodal image registration algorithm which estimates the transformation parameters of unregistered images of different modalities.
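As an illustrative sketch (the notation here is assumed, not taken verbatim from the paper): in the co-sparse analysis model, a signal u is characterized by the rows of an analysis operator \Omega_u to which it is orthogonal, the so-called co-support. For a bimodal pair (u, v) with a coupled pair of operators (\Omega_u, \Omega_v), the stated assumption of correlated co-supports can be written as

\[ \Lambda_u = \{\, i : (\Omega_u u)_i = 0 \,\}, \qquad \Lambda_v = \{\, i : (\Omega_v v)_i = 0 \,\}, \qquad \Lambda_u \approx \Lambda_v . \]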
High-definition video over IP-based networks (IPTV) has become a mainstay in today's consumer environment. In most applications, encoders conforming to the H.264/AVC standard are used. But even within one standard, a wide range of coding tools is often available that can deliver vastly different visual quality. In this contribution, we therefore evaluate different coding technologies, using different encoder settings of H.264/AVC as well as a completely different encoder, Dirac. We cover a wide range of bitrates, from ADSL to VDSL, and different content with both low and high demands on the encoders. As PSNR is not well suited to describe perceived visual quality, we conducted extensive subjective tests to determine the visual quality. Our results show that for currently common bitrates, the visual quality can be more than doubled if the same coding technology is used with different coding tools.
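For reference, the PSNR measure criticized above as a poor predictor of perceived quality is defined over the mean squared error between a reference frame x and its decoded counterpart \hat{x} (an 8-bit peak value of 255 is assumed here):

\[ \mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}}, \qquad \mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( x_i - \hat{x}_i \right)^2 . \]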
The well-established static head-related transfer function (HRTF) measurement approaches using maximum length sequences and sine sweeps are compared with a recent HRTF estimation approach using normalized least mean square adaptive filters, which allows the person being measured to move continuously during the recording of the excitation signal. By measuring HRTFs during continuous movement, a substantial amount of time can be saved in individual HRTF estimation, enabling the creation of a dense HRTF database for headphone-based sound synthesis or for applications such as crosstalk cancellation in loudspeaker-based sound synthesis. The different approaches are implemented and compared experimentally through objective and subjective evaluation.
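A minimal sketch of the standard normalized least mean square (NLMS) update underlying such an adaptive estimation, with symbols assumed here rather than taken from the paper: the filter estimate \mathbf{w}(n), modeling the HRTF impulse response, is adapted from the excitation signal \mathbf{x}(n) and the in-ear microphone signal d(n) via

\[ e(n) = d(n) - \mathbf{w}^{\mathsf{T}}(n)\,\mathbf{x}(n), \qquad \mathbf{w}(n+1) = \mathbf{w}(n) + \frac{\mu}{\|\mathbf{x}(n)\|^2 + \delta}\, e(n)\, \mathbf{x}(n), \]

where \mu is the step size and \delta a small regularization constant preventing division by zero.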
Sound source localization algorithms determine the physical position of a sound source with respect to a listener. For practical applications, a localization algorithm design has to take into account real-world conditions such as multiple active sources, reverberation, and noise. The application can impose additional constraints on the algorithm, e.g., a requirement for low latency. This work defines the most important constraints for practical applications, introduces an algorithm that aims to fulfill all requirements as well as possible, and compares it to state-of-the-art sound source localization approaches.
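The abstract does not name the specific method; as one common baseline for localization under reverberation (not necessarily the approach proposed here), generalized cross-correlation with phase transform (GCC-PHAT) estimates the time difference of arrival between two microphone signals with spectra X_1(f) and X_2(f):

\[ \hat{\tau} = \arg\max_{\tau} \int \frac{X_1(f)\, X_2^{*}(f)}{\left| X_1(f)\, X_2^{*}(f) \right|}\, e^{\, j 2 \pi f \tau}\, df . \]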