2017
DOI: 10.1121/1.4990375
Modeling speech localization, talker identification, and word recognition in a multi-talker setting

Abstract: This study introduces a model for solving three different auditory tasks in a multi-talker setting: target localization, target identification, and word recognition. The model was used to simulate psychoacoustic data from a call-sign-based listening test involving multiple spatially separated talkers [Brungart and Simpson (2007). Percept. Psychophys. 69(1), 79–91]. The main characteristics of the model are (i) the extraction of salient auditory features (“glimpses”) from the multi-talker signal and (ii) the us…

Cited by 16 publications (37 citation statements)
References 37 publications
“…If the model by Josupeit and Hohmann (2017) can predict the SRM, this would support the statement of Schoenmaker and van de Par (2016) that BMLD does not contribute to SRM, at least in the investigated auditory scene with three competing talkers and no additional background noise. Another question of interest is whether the model word recognition scores are similar to those of the subjects.…”
Section: Introduction (supporting)
confidence: 82%
“…The standard deviation depended on the type of auditory feature used. In line with Josupeit and Hohmann (2017), they were set to σ_P = (20 f_c)^(−1) for f_c ≤ 1400 Hz and σ_P = 0.2 ms for f_c > 1400 Hz, with σ_E = 1 dB, σ_T = 25 μs, and σ_L = 1 dB. For each cns-bin and each auditory feature, the kernels were summed up and normalized so that they formed a probability density function (PDF). In the following, these PDFs are termed f_X(x, c, n, s | w), where X ∈ {P, E, T, L} identifies the feature type, x identifies the function variable, and w identifies the word.…”
Section: Stimuli (mentioning)
confidence: 99%
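The kernel-summation step quoted above — one smoothing kernel per observed feature value, summed and normalized into a PDF — can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function and variable names (`feature_pdf`, `samples`, `grid`) are mine, and a simple Gaussian kernel stands in for whatever kernel shape the model actually uses.

```python
import numpy as np

def feature_pdf(samples, sigma, grid):
    """Place one Gaussian kernel (width sigma) at each observed feature
    value, sum the kernels, and normalize so the result is a probability
    density function over `grid`."""
    samples = np.asarray(samples, dtype=float)
    # Shape (n_samples, n_grid): one kernel row per observed value
    kernels = np.exp(-0.5 * ((grid[None, :] - samples[:, None]) / sigma) ** 2)
    density = kernels.sum(axis=0)
    # Normalize so the density integrates to 1 over the grid
    dx = grid[1] - grid[0]
    density /= density.sum() * dx
    return density

grid = np.linspace(-5.0, 5.0, 501)
pdf = feature_pdf([-1.0, 0.0, 1.0], sigma=1.0, grid=grid)
```

In the model, one such PDF would be built per feature type and per cns-bin, with the kernel width set to the feature-specific standard deviation (σ_P, σ_E, σ_T, or σ_L).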