Assessment of Cognitive Load, Speech Communication Quality and Quality of Experience for spatial and non-spatial audio conferencing calls

Skowronek, Janto; Raake, Alexander

doi:10.1016/j.specom.2014.10.003

Cited by 16 publications

(13 citation statements)

References 11 publications


“…No effects of presentation mode on any subjective construct turned out to be statistically significant. This stands in direct contrast to a number of previous studies examining "cocktail party" contexts (Raake et al, 2010; Koelewijn et al, 2015; Skowronek and Raake, 2015), which had reported higher perceived speech quality and speech intelligibility as well as reduced talker/speech identification effort following spatial speech presentation. The assessed subjective constructs appeared to be independent of spatialization, at least within the realized, probably less complex "turn-taking" listening scenario.…”

Section: Rating (contrasting)

confidence: 99%

“…The changes in voice similarity closely corresponded with experienced difficulty of TI, being "easy"/"very easy" under clean speech, but increasing to a small extent under degraded speech (Raake et al, 2010; Zekveld et al, 2014; Skowronek and Raake, 2015). Task difficulty should most probably depend on the amount of allocated information processing resources [i.e., the perceptual-cognitive load (Wickens, 2008); measurable, e.g., by pupillometry or EEG] to discriminate between the two talkers' voices, which was higher for degraded vs. clean speech; better talker voice discriminability would in turn ease TI based on voice recognition.…”

Section: Rating (mentioning)

confidence: 82%

“…In general, presence of speech degradations should impede TI due to obscuring of individual talkers' voice characteristics. Thus, presentations of degraded (noisy, filtered) speech stimuli would be anticipated to reduce perceived speech quality and speech intelligibility as well as increase voice similarity and TI effort relative to clean stimuli (Leman et al, 2008; Raake et al, 2010; Skowronek and Raake, 2015).…”

Section: Speech Degradation (mentioning)

confidence: 99%

“…In applied settings, audio spatialization has been recommended as an important design feature of multi-talker speech displays that optimizes its effectiveness without having to attenuate or exclude non-target channels and losing potentially relevant information (Ericson et al, 2004). Related research work suggests strongest effects of spatial auditory information on TI performance if the number of talkers is high, talkers' voices are perceptually similar (e.g., due to same gender of talkers), and quality of transmitted speech is perceived as being low (e.g., due to limited transmission bandwidth) (Blum et al, 2010; Raake et al, 2010; Skowronek and Raake, 2015). Moreover, listening-only test scenarios have proven to be more sensitive to experimental manipulations of audio spatialization (and speech degradation) than conversational test scenarios (Skowronek and Raake, 2015).…”

Section: Introduction (mentioning)

confidence: 99%


Effects of Spatial Speech Presentation on Listener Response Strategy for Talker-Identification

et al. 2022


This study investigates effects of spatial auditory cues on human listeners' response strategy for identifying two alternately active talkers (“turn-taking” listening scenario). Previous research has demonstrated subjective benefits of audio spatialization with regard to speech intelligibility and talker-identification effort. So far, the deliberate activation of specific perceptual and cognitive processes by listeners to optimize their task performance remained largely unexamined. Spoken sentences selected as stimuli were either clean or degraded due to background noise or bandpass filtering. Stimuli were presented via three horizontally positioned loudspeakers: In a non-spatial mode, both talkers were presented through a central loudspeaker; in a spatial mode, each talker was presented through the central or a talker-specific lateral loudspeaker. Participants identified talkers via speeded keypresses and afterwards provided subjective ratings (speech quality, speech intelligibility, voice similarity, talker-identification effort). In the spatial mode, presentations at lateral loudspeaker locations entailed quicker behavioral responses, which were significantly slower in comparison to a talker-localization task. Under clean speech, response times globally increased in the spatial vs. non-spatial mode (across all locations); these “response time switch costs,” presumably being caused by repeated switching of spatial auditory attention between different locations, diminished under degraded speech. No significant effects of spatialization on subjective ratings were found. The results suggested that when listeners could utilize task-relevant auditory cues about talker location, they continued to rely on voice recognition instead of localization of talker sound sources as primary response strategy. Besides, the presence of speech degradations may have led to increased cognitive control, which in turn compensated for incurring response time switch costs.


“…At a later stage when the speech signal is being processed semantically, effort increases when the content topic is obscure and more context information needs to be recalled from memory to aid comprehension. Effort is also influenced by the demands of concurrent tasks (Skowronek and Raake, 2014) as attention needs to be constantly reallocated depending on the dynamics of a subtask. This pathway is particularly relevant to the design of technology and multimedia applications where people increasingly consume multimedia while multi-tasking in day-to-day scenarios.…”

Section: Integrating Listening Effort Into Existing QoE Framework (mentioning)

confidence: 99%

Listening Effort Informed Quality of Experience Evaluation

Sun, Hines. 2022. Front. Psychol.


Perceived quality of experience for speech listening is influenced by cognitive processing and can affect a listener's comprehension, engagement and responsiveness. Quality of Experience (QoE) is a paradigm used within the media technology community to assess media quality by linking quantifiable media parameters to perceived quality. The established QoE framework provides a general definition of QoE, categories of possible quality influencing factors, and an identified QoE formation pathway. These assist researchers to implement experiments and to evaluate perceived quality for any applications. The QoE formation pathways in the current framework do not attempt to capture cognitive effort effects and the standard experimental assessments of QoE minimize the influence from cognitive processes. The impact of cognitive processes and how they can be captured within the QoE framework have not been systematically studied by the QoE research community. This article reviews research from the fields of audiology and cognitive science regarding how cognitive processes influence the quality of listening experience. The cognitive listening mechanism theories are compared with the QoE formation mechanism in terms of the quality contributing factors, experience formation pathways, and measures for experience. The review prompts a proposal to integrate mechanisms from audiology and cognitive science into the existing QoE framework in order to properly account for cognitive load in speech listening. The article concludes with a discussion regarding how an extended framework could facilitate measurement of QoE in broader and more realistic application scenarios where cognitive effort is a material consideration.


Binaural Evaluation of Sound Quality and Quality of Experience

Raake, Wierstorf. 2020. Modern Acoustics and Signal Processing
