Abstract: Virtual 3-D sound can be easily delivered to a listener by binaural audio signals reproduced via headphones, which guarantees that only the correct signal reaches each ear. Reproducing the binaural audio signal with two or more loudspeakers introduces the problems of crosstalk on the one hand and reverberation on the other. In crosstalk cancellation, the audio signals are fed through a network of prefilters prior to loudspeaker reproduction to ensure that only the designated signal reaches the corresponding ear of the listener. Since room impulse responses are very sensitive to spatial mismatch, and since listeners might move slightly while listening, robust designs are needed. In this paper, we present a method that jointly handles the three problems of crosstalk, reverberation reduction, and spatial robustness with respect to varying listening positions for one or more binaural source signals and multiple listeners. The proposed method is based on a multichannel room impulse response reshaping approach that optimizes a p-norm based criterion. Replacing the well-known least-squares technique by a p-norm based method with a large value of p allows us to explicitly control the amount of crosstalk and to shape the remaining reverberation effects according to a desired decay.
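The role of a large p in the abstract above can be illustrated with a small numerical sketch. It is not the authors' criterion, only a demonstration of the general property that the p-norm of a vector approaches its largest absolute entry as p grows, which is why a large p penalizes the worst residual tap rather than the average energy. The helper `p_norm` and the `crosstalk` values are hypothetical and chosen purely for illustration.

```python
import numpy as np

def p_norm(x, p):
    """p-norm of x, computed with the peak factored out to avoid overflow for large p."""
    x = np.abs(np.asarray(x, dtype=float))
    peak = x.max()
    if peak == 0.0:
        return 0.0
    return peak * np.sum((x / peak) ** p) ** (1.0 / p)

# Hypothetical residual crosstalk taps (illustrative values, not measured data).
crosstalk = np.array([0.02, -0.05, 0.01, 0.30, -0.04])

for p in (2, 10, 50):
    # As p grows, the value approaches the largest absolute tap.
    print(p, p_norm(crosstalk, p))
```

With p = 2 every tap contributes to the value, whereas with p = 50 the value is dominated by the single largest tap, so minimizing it acts on the peak of the unwanted response.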
The purpose of room impulse response reshaping is to reduce reverberation and thus to improve the perceived quality of the received signal by prefiltering the source signal before it is played with a loudspeaker. The filter design is usually carried out by solving an optimization problem. There are, in general, two possibilities to improve the robustness of the equalizers against small movements of the listener and/or receiver, namely multi-position approaches or the utilization of a regularization term. Multi-position approaches suffer from the extensive effort of measuring multiple room impulse responses. Stochastic models may describe the average system error due to spatial mismatch, but only quadratic penalty terms have been considered so far. In this contribution we propose a third method to improve robustness against spatial misalignment. We combine the two approaches by generating multiple realizations of distorted room impulse responses and feeding them into the multi-position algorithm. Based on our previous work, we propose a model to capture the perturbations with respect to the assumed displacement. Index Terms: room impulse response, RIR reshaping, p-norm, spatial robustness.
In this contribution, six different single-channel dereverberation algorithms are evaluated subjectively in terms of speech intelligibility and speech quality. To study the influence of the dereverberation algorithms on speech intelligibility, speech reception thresholds in noise were measured for different reverberation times. The quality ratings were obtained following the ITU-T P.835 recommendations (with slight changes for adaptation to the problem of dereverberation) and included assessment of the attributes reverberant, colored, distorted, and overall quality. Most of the algorithms improved speech intelligibility for short as well as long reverberation times compared to the reverberant condition. The best performance in terms of speech intelligibility and quality was observed for the regularized spectral inverse approach with pre-echo removal. The overall quality of the processed signals was highly correlated with the attributes reverberant and/or distorted. To generalize the present outcomes, further studies are needed to account for the influence of the estimation errors
This paper reports on the evaluation of several objective quality measures for predicting the quality of dereverberated speech signals. The correlations between subjective quality assessments for single-channel dereverberation techniques and objective speech quality as well as speech intelligibility measures are analyzed and discussed. Six different single-channel dereverberation algorithms were included in the evaluation to account for different types of distortions. The subjective quality was assessed along the four attributes reverberant, colored, distorted, and overall quality, following the recommendations of ITU-T P.835. The objective measures included system-based, i.e., channel-based, as well as signal-based measures
By using room impulse response shortening and reshaping, it is possible to reduce reverberation effects and therefore improve the perceived quality. This may be achieved by a prefilter that modifies the overall impulse response to have a faster decay. The traditional filter shortening approach using least-squares methods is fast and directly computable, but it suffers from late echoes. Newer approaches using the p-norm overcome this drawback but are computationally very demanding, as the optimization process uses a gradient-descent approach with slow convergence. In this work we propose a modification to this approach that results in significantly faster convergence. With this modification, the algorithm is less likely to be trapped in a local minimum and therefore also reaches a better convergence point. The method is demonstrated on simulated and real-world room impulse responses.
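The "fast and directly computable" least-squares baseline mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: it uses one common least-squares formulation (fit the overall response to a delayed unit impulse), and the function names, the synthetic decaying response `c`, the filter length, and the delay are all hypothetical choices.

```python
import numpy as np

def convolution_matrix(c, n_taps):
    """Matrix C such that C @ h equals np.convolve(c, h) for a prefilter h of length n_taps."""
    c = np.asarray(c, dtype=float)
    C = np.zeros((len(c) + n_taps - 1, n_taps))
    for k in range(n_taps):
        C[k:k + len(c), k] = c
    return C

def ls_prefilter(c, n_taps, delay):
    """Least-squares prefilter h so that the overall response c * h approximates a delayed impulse."""
    C = convolution_matrix(c, n_taps)
    d = np.zeros(C.shape[0])
    d[delay] = 1.0                      # desired overall response (assumed target)
    h, *_ = np.linalg.lstsq(C, d, rcond=None)
    return h, C @ h

# Synthetic exponentially decaying response standing in for a measured RIR (assumption).
rng = np.random.default_rng(0)
n = np.arange(200)
c = rng.standard_normal(200) * np.exp(-n / 40.0)

h, g = ls_prefilter(c, n_taps=300, delay=20)   # g is the overall (equalized) response
```

The solution is obtained in a single linear solve, which is what makes this baseline fast. Because the quadratic error spreads the residual over the whole response, nothing prevents isolated late taps from remaining, which is the drawback the p-norm approaches address at higher computational cost.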