2020
DOI: 10.1088/1741-2552/aba6f8
Linear versus deep learning methods for noisy speech separation for EEG-informed attention decoding

Abstract: Objective. A hearing aid’s noise reduction algorithm cannot infer to which speaker the user intends to listen. Auditory attention decoding (AAD) algorithms make it possible to infer this information from neural signals, which leads to the concept of neuro-steered hearing aids. We aim to evaluate and demonstrate the feasibility of AAD-supported speech enhancement in challenging noisy conditions based on electroencephalography recordings. Approach. The AAD performance with a linear versus a deep neural network (DNN) bas…

Cited by 28 publications (17 citation statements)
References 65 publications
“…This can be achieved by applying the strategy of Sections II-A and II-B in combination with an appropriate coding scheme (e.g., one-vs-one, one-vs-all) for both the CSP and LDA steps, or by approximating a joint diagonalization of all the class covariance matrices at once in the CSP block [20] and only applying a coding scheme to the LDA step. Note that the stimulus reconstruction approach is also applicable to multiple directions/speakers [11], [26]. In this paper, we adopt the popular one-vs-all approach from BCI research [20].…”
Section: Multiclass CSP Classification
confidence: 99%
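The one-vs-all coding scheme mentioned in the excerpt above can be sketched as follows. This is a minimal illustration, not the cited paper's implementation: the CSP spatial-filtering step is replaced by simple log-variance features, and the synthetic data, class count, and scikit-learn pipeline are all assumptions.

```python
# One-vs-all coding for a multiclass direction-of-attention classifier:
# one binary LDA per class, highest decision score wins.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.multiclass import OneVsRestClassifier

rng = np.random.default_rng(0)

# Illustrative shapes: 120 trials, 8 EEG channels, 250 samples, 3 directions.
n_trials, n_channels, n_samples, n_classes = 120, 8, 250, 3
eeg = rng.standard_normal((n_trials, n_channels, n_samples))
labels = rng.integers(0, n_classes, n_trials)  # attended direction per trial

# Log-variance per channel is a common stand-in feature after CSP filtering.
features = np.log(eeg.var(axis=2))

clf = OneVsRestClassifier(LinearDiscriminantAnalysis()).fit(features, labels)
pred = clf.predict(features)
print(pred.shape)  # one predicted direction per trial
```

The same one-vs-all wrapper could equally be applied after a real (joint-diagonalization) CSP stage; only the feature-extraction step would change.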
“…No a priori principal component analysis or change of filter basis as in [24] is used. The EEG and speech envelopes, the latter extracted using a power-law operation with exponent 0.6 after subband filtering [7], are filtered between 1 and 9 Hz (thus largely excluding α/β-activity, which was determined to be optimal for linear stimulus reconstruction [6], [7], [11], [26], [29]) and downsampled to 20 Hz. Note that this method employs an inherently different strategy for AAD than FB-CSP, by (in a way) reconstructing the attended speech envelope rather than decoding the directional focus of attention.…”
Section: A Comparison With the Stimulus Reconstruction Approach
confidence: 99%
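The envelope-extraction pipeline described in the excerpt (subband filtering, a power-law with exponent 0.6, a 1–9 Hz band-pass, and downsampling to 20 Hz) can be sketched as below. The audio sampling rate, subband edges, and filter orders are illustrative assumptions; the cited works specify their own filterbanks.

```python
# Sketch of envelope extraction: per-subband |.|^0.6, sum, 1-9 Hz band-pass,
# then downsample to 20 Hz.
import numpy as np
from scipy.signal import butter, sosfiltfilt, resample_poly

fs = 8000  # assumed audio sampling rate
t = np.arange(fs * 2) / fs  # 2 s of test signal
audio = np.sin(2 * np.pi * 440 * t) * (1 + 0.5 * np.sin(2 * np.pi * 4 * t))

# 1) Split into subbands, apply the power-law per band, and sum.
edges = [(100, 500), (500, 1500), (1500, 3500)]  # illustrative band edges
envelope = np.zeros_like(audio)
for lo, hi in edges:
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    envelope += np.abs(sosfiltfilt(sos, audio)) ** 0.6

# 2) Band-pass the summed envelope between 1 and 9 Hz.
sos = butter(2, [1, 9], btype="bandpass", fs=fs, output="sos")
envelope = sosfiltfilt(sos, envelope)

# 3) Downsample to 20 Hz (8000 / 400 = 20).
envelope_20hz = resample_poly(envelope, up=1, down=400)
print(envelope_20hz.shape)
```

The same 1–9 Hz filtering and 20 Hz rate would be applied to the EEG so that the decoder operates on matched signals.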
“…In fact, if the hearing aid were able to “know” which audio source the user is attending to, it could selectively enhance it. Therefore, combining AAD algorithms with hearing aid technologies should lead to next-generation hearing aids that perform well in complex (or noisy) auditory environments (see, for example: Das et al., 2016, 2020; Van Eyndhoven et al., 2017; Cantisani et al., 2020; Geirnaert et al., 2020).…”
Section: Application in Neuro-steered Hearing Aids
confidence: 99%
“…The recent understanding of selective auditory attention in the cocktail party problem and the advances in electrophysiological technologies make it possible to decode auditory attention from EEG signals in complex auditory scenarios. For natural continuous speech streams, the extensively used auditory attention decoding (AAD) methods are based on mapping functions between the speech envelope and the corresponding EEG responses via linear and non-linear computational models (e.g., Ding and Simon, 2012b; O’Sullivan et al., 2015; Crosse et al., 2016; Ciccarelli et al., 2019; Das et al., 2020; Geravanchizadeh and Roushan, 2021). Specifically, linear decoder models, such as the temporal response function (TRF), have been widely used to decode auditory attention with reasonable accuracy over a wide range of signal-to-masker ratios (SMRs) (Crosse et al., 2016).…”
Section: Introduction
confidence: 99%
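The linear stimulus-reconstruction idea behind these decoders can be sketched as follows: a ridge-regularized backward model maps time-lagged EEG to the attended speech envelope, and attention is decoded by correlating the reconstruction with each candidate envelope. The data are synthetic, and the lag range, mixing model, and regularization value are assumptions for illustration only.

```python
# Backward (stimulus-reconstruction) decoder sketch at 20 Hz.
import numpy as np

rng = np.random.default_rng(1)
fs, n_ch, n_lags, n = 20, 16, 5, 1200  # 20 Hz, 0-200 ms lags, 60 s of data

env_att = rng.standard_normal(n)         # attended envelope (synthetic)
env_unatt = rng.standard_normal(n)       # unattended envelope (synthetic)
mix = rng.standard_normal((n_ch, n_ch))  # synthetic forward mixing
eeg = mix @ np.tile(env_att, (n_ch, 1)) + 2.0 * rng.standard_normal((n_ch, n))

# Lagged design matrix: one column per (channel, lag) pair.
# np.roll wraps around at the edges; fine for a sketch.
X = np.column_stack([np.roll(eeg[c], lag) for c in range(n_ch)
                     for lag in range(n_lags)])

# Ridge-regularized least squares for the decoder weights.
lam = 1e-2
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ env_att)
recon = X @ w

corr = lambda a, b: np.corrcoef(a, b)[0, 1]
attended_first = corr(recon, env_att) > corr(recon, env_unatt)
print(attended_first)  # decoder should track the attended envelope
```

In practice the decoder is trained on held-out data and the correlation comparison is made per decision window, with AAD accuracy depending strongly on that window length.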