Interaction between voice-gender difference and spatial separation in release from masking in multi-talker listening environments

Oh, Yonghee; Bridges, Sarah E.; Schoenfeld, Hannah; Layne, Allison O.; Eddins, David A.

doi:10.1121/10.0005831

Cited by 9 publications

(19 citation statements)

References 26 publications

(44 reference statements)

“…With the spatially separated maskers, SRTs worsened by 3.2 dB with 2-mm inter-aural mismatch, much less than the drop of 7.5 dB for the dichotic masker presentation in Xu et al [10]. This is consistent with the observation that dichotic presentation may overestimate spatial benefits for the segregation of competing speech [13–17, 19]. Accordingly, the 1.8 dB of spatial masking release was also much less than the 7.6 dB observed in Xu et al (2020).…”

Section: Discussion (supporting)

confidence: 91%

“…In the case of a headphone presentation, the difference between diotic and dichotic masker presentation have been used to estimate spatial masking release [10]. However, dichotic masker presentation via headphones may overestimate spatial masking release in NH listeners compared to presentations in a sound field with symmetrically placed maskers [13–17]. Head-related transfer functions (HRTFs) may help to compensate for differences in spatial masking release between headphone and sound field presentations [18].…”

Section: Introduction (mentioning)

confidence: 99%

Effects of tonotopic matching and spatial cues on segregation of competing speech in simulations of bilateral cochlear implants

Thomas, Willis, Galvin et al. (2022). PLoS ONE

In the clinical fitting of cochlear implants (CIs), the lowest input acoustic frequency is typically much lower than the characteristic frequency associated with the most apical electrode position, due to the limited electrode insertion depth. For bilateral CI users, electrode positions may differ across ears. However, the same acoustic-to-electrode frequency allocation table (FAT) is typically assigned to both ears. As such, bilateral CI users may experience both intra-aural frequency mismatch within each ear and inter-aural mismatch across ears. This inter-aural mismatch may limit the ability of bilateral CI users to take advantage of spatial cues when attempting to segregate competing speech. Adjusting the FAT to tonotopically match the electrode position in each ear (i.e., increasing the low acoustic input frequency) is theorized to reduce this inter-aural mismatch. Unfortunately, this approach may also introduce the loss of acoustic information below the modified input acoustic frequency. The present study explored the trade-off between reduced inter-aural frequency mismatch and low-frequency information loss for segregation of competing speech. Normal-hearing participants were tested while listening to acoustic simulations of bilateral CIs. Speech reception thresholds (SRTs) were measured for target sentences produced by a male talker in the presence of two different male talkers. Masker speech was either co-located with or spatially separated from the target speech. The bilateral CI simulations were produced by 16-channel sinewave vocoders; the simulated insertion depth was fixed in one ear and varied in the other ear, resulting in an inter-aural mismatch of 0, 2, or 6 mm in terms of cochlear place. Two FAT conditions were compared: 1) clinical (200–8000 Hz in both ears), or 2) matched to the simulated insertion depth in each ear. 
Results showed that SRTs were significantly lower with the matched than with the clinical FAT, regardless of the insertion depth or spatial configuration of the masker speech. The largest improvement in SRTs with the matched FAT was observed when the inter-aural mismatch was largest (6 mm). These results suggest that minimizing inter-aural mismatch with tonotopically matched FATs may benefit bilateral CI users’ ability to segregate competing speech despite substantial low-frequency information loss in ears with shallow insertion depths.

“…Spatial cues are particularly useful when a speech target and speech maskers are qualitatively similar, same gender for instance, due to non-energetic factors. In such cases, SRM can be as much as 10 dB or 12 dB at the widest separations for an identification task (Marrone et al, 2008a; Oh et al, 2021). In the present study, maskers were the opposite gender of the target because the running speech materials lacked a comparable voice identifier, like a call sign in the CRM corpus, but there was still between 6.7 and 8.3 dB SRM on average with 30° separation, depending on whether a background babble was present or not.…”

Section: Discussion (mentioning)

confidence: 99%

Defining functional spatial boundaries using a spatial release from masking task

Ozmeral, Higgins (2022). JASA Express Letters

The classic spatial release from masking (SRM) task measures speech recognition thresholds for discrete separation angles between a target and masker. Alternatively, this study used a modified SRM task that adaptively measured the spatial-separation angle needed between a continuous male target stream (speech with digits) and two female masker streams to achieve a specific SRM. On average, 20 young normal-hearing listeners needed less spatial separation for 6 dB release than 9 dB release, and the presence of background babble reduced across-listener variability on the paradigm. Future work is needed to better understand the psychometric properties of this adaptive procedure.
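As a rough illustration of the quantity this abstract refers to, SRM is conventionally computed as the threshold with co-located maskers minus the threshold with spatially separated maskers, so a positive value means separation helped. The sketch below assumes that convention; the function name and the threshold values are hypothetical and are not data from any study listed here.

```python
def spatial_release_from_masking(srt_colocated_db: float, srt_separated_db: float) -> float:
    """Return SRM in dB.

    Positive values mean the separated condition gave a lower (better)
    speech reception threshold than the co-located condition.
    """
    return srt_colocated_db - srt_separated_db


# Hypothetical thresholds in dB, chosen only to illustrate the arithmetic.
srm_db = spatial_release_from_masking(srt_colocated_db=2.0, srt_separated_db=-6.0)
print(f"SRM = {srm_db:.1f} dB")
```

The adaptive task described in the abstract inverts this relationship: rather than fixing the angle and measuring SRM, it searches for the separation angle that yields a chosen SRM (6 or 9 dB).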

“…For each stimulus condition, half of the sentences were male talkers and another half were female talkers, and subjects were tested with those different talker gender conditions in separate sessions. It is noted that previous studies reported that the target talker’s gender could affect speech recognition performance in multi-talker listening situations (e.g., Oh et al, 2021). In the current study, the male and female voices were used as one factor to explain this gender-specific difference in multisensory speech perception benefits.…”

Section: Methods (mentioning)

confidence: 99%

Multisensory benefits for speech recognition in noisy environments

Schwalm, Kalpin (2022). Front. Neurosci.

Self Cite

A series of our previous studies explored the use of an abstract visual representation of the amplitude envelope cues from target sentences to benefit speech perception in complex listening environments. The purpose of this study was to expand this auditory-visual speech perception to the tactile domain. Twenty adults participated in speech recognition measurements in four different sensory modalities (AO, auditory-only; AV, auditory-visual; AT, auditory-tactile; AVT, auditory-visual-tactile). The target sentences were fixed at 65 dB sound pressure level and embedded within a simultaneous speech-shaped noise masker of varying degrees of signal-to-noise ratios (−7, −5, −3, −1, and 1 dB SNR). The amplitudes of both abstract visual and vibrotactile stimuli were temporally synchronized with the target speech envelope for comparison. Average results showed that adding temporally-synchronized multimodal cues to the auditory signal did provide significant improvements in word recognition performance across all three multimodal stimulus conditions (AV, AT, and AVT), especially at the lower SNR levels of −7, −5, and −3 dB for both male (8–20% improvement) and female (5–25% improvement) talkers. The greatest improvement in word recognition performance (15–19% improvement for males and 14–25% improvement for females) was observed when both visual and tactile cues were integrated (AVT). Another interesting finding in this study is that temporally synchronized abstract visual and vibrotactile stimuli additively stack in their influence on speech recognition performance. Our findings suggest that a multisensory integration process in speech perception requires salient temporal cues to enhance speech recognition ability in noisy environments.
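To make the level arithmetic in this abstract concrete: with the target fixed at 65 dB SPL, each SNR condition implies a masker level, assuming SNR is defined in the usual way as target level minus masker level in dB. The sketch below works under that assumption; the constant and function names are illustrative and do not come from the study.

```python
TARGET_LEVEL_DB_SPL = 65.0          # fixed target level stated in the abstract
SNRS_DB = [-7, -5, -3, -1, 1]       # SNR conditions stated in the abstract


def masker_level_db_spl(target_db_spl: float, snr_db: float) -> float:
    """Masker level implied by a target level and an SNR (SNR = target - masker)."""
    return target_db_spl - snr_db


# Map each SNR condition to the masker level it implies.
masker_levels = {snr: masker_level_db_spl(TARGET_LEVEL_DB_SPL, snr) for snr in SNRS_DB}
for snr, level in masker_levels.items():
    print(f"SNR {snr:+d} dB -> masker at {level:.0f} dB SPL")
```

Under this assumption the masker spans 72 dB SPL at the hardest condition down to 64 dB SPL at the easiest.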
