Márton Marschall scite author profile

To achieve accurate spatial auditory perception, subjects typically require personal head-related transfer functions (HRTFs) and the freedom for head movements. Loudspeaker-based virtual sound environments allow for realism without individualized measurements. To study audio-visual perception in realistic environments, the combination of spatially tracked head mounted displays (HMDs), also known as virtual reality glasses, and virtual sound environments may be valuable. However, HMDs were recently shown to affect the subjects’ HRTFs and thus might influence sound localization performance. Furthermore, due to limitations of the reproduction of visual information on the HMD, audio-visual perception might be influenced. Here, a sound localization experiment was conducted both with and without an HMD and with a varying amount of visual information provided to the subjects. Furthermore, interaural time and level difference errors (ITDs and ILDs) as well as spectral perturbations induced by the HMD were analyzed and compared to the perceptual localization data. The results showed a reduction of the localization accuracy when the subjects were wearing an HMD and when they were blindfolded. The HMD-induced error in azimuth localization was found to be larger in the left than in the right hemisphere. When visual information of the limited set of source locations was provided, the localization error induced by the HMD was found to be negligible. Presenting visual information of hand-location and room dimensions showed better sound localization performance compared to the condition with no visual information. The addition of possible source locations further improved the localization accuracy. Also adding pointing feedback in form of a virtual laser pointer improved the accuracy of elevation perception but not of azimuth perception.

show abstract

Speech perception is similar for musicians and non-musicians across a wide range of conditions

Madsen

Marschall

Dau

et al. 2019

Sci Rep

View full text Add to dashboard Cite

It remains unclear whether musical training is associated with improved speech understanding in a noisy environment, with different studies reaching differing conclusions. Even in those studies that have reported an advantage for highly trained musicians, it is not known whether the benefits measured in laboratory tests extend to more ecologically valid situations. This study aimed to establish whether musicians are better than non-musicians at understanding speech in a background of competing speakers or speech-shaped noise under more realistic conditions, involving sounds presented in space via a spherical array of 64 loudspeakers, rather than over headphones, with and without simulated room reverberation. The study also included experiments testing fundamental frequency discrimination limens (F0DLs), interaural time differences limens (ITDLs), and attentive tracking. Sixty-four participants (32 non-musicians and 32 musicians) were tested, with the two groups matched in age, sex, and IQ as assessed with Raven’s Advanced Progressive matrices. There was a significant benefit of musicianship for F0DLs, ITDLs, and attentive tracking. However, speech scores were not significantly different between the two groups. The results suggest no musician advantage for understanding speech in background noise or talkers under a variety of conditions.

show abstract

Higher-Order Spatial Impulse Response Rendering: Investigating the Perceived Effects of Spherical Order, Dedicated Diffuse Rendering, and Frequency Resolution

McCormack¹,

Pulkki

Politis³

et al. 2020

J. Audio Eng. Soc.

View full text Add to dashboard Cite

This article details an investigation into the perceptual effects of different rendering strategies when synthesizing loudspeaker array room impulse responses (RIRs) using microphone array RIRs in a parametric fashion. The aim of this rendering task is to faithfully reproduce the spatial characteristics of a captured space, encoded within the input microphone array RIR (or the spherical harmonic RIR derived from it), over a loudspeaker array. For this study, a higherorder formulation of the Spatial Impulse Response Rendering (SIRR) method is introduced and subsequently employed to investigate the perceptual effects of the following rendering configurations: the spherical harmonic input order, frequency resolution, and utilizing dedicated diffuse stream rendering. Formal listening tests were conducted using a 64-channel loudspeaker array in an anechoic chamber, where simulated reference scenarios were compared against the outputs of different methods and rendering configurations. The test results indicate that dedicated diffuse stream rendering and higher analysis orders both yield noticeable perceptual improvements, particularly when employing problematic transient stimuli as input. Additionally, it was found that the frequency resolution employed during rendering has only a minor influence over the perceived accuracy of the reproduction in comparison to the other two tested attributes.

show abstract

Transforming a Six Month Release Cycle to Continuous Flow

Marschall¹

2007

View full text Add to dashboard Cite

The effect of spatial energy spread on sound image size and speech intelligibility

Ahrens

Marschall

Dau

2020

View full text Add to dashboard Cite

This study explored the relationship between perceived sound image size and speech intelligibility for sound sources reproduced over loudspeakers. Sources with varying degrees of spatial energy spread were generated using ambisonics processing. Young normal-hearing listeners estimated sound image size as well as performed two spatial release from masking (SRM) tasks with two symmetrically arranged interfering talkers. Either the target-to-masker ratio or the separation angle was varied adaptively. Results showed that the sound image size did not change systematically with the energy spread. However, a larger energy spread did result in a decreased SRM. Furthermore, the listeners needed a greater angular separation angle between the target and the interfering sources for sources with a larger energy spread. Further analysis revealed that the method employed to vary the energy spread did not lead to systematic changes in the interaural cross correlations. Future experiments with competing talkers using ambisonics or similar methods may consider the resulting energy spread in relation to the minimum separation angle between sound sources in order to avoid degradations in speech intelligibility.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.