2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2019
DOI: 10.1109/waspaa.2019.8937177

Supervised Contrastive Embeddings for Binaural Source Localization

Abstract: Recent data-driven approaches for binaural source localization are able to learn the non-linear functions that map measured binaural cues to source locations. This is done either by learning a parametric map directly using training data, or by learning a low-dimensional representation (embedding) of the binaural cues that is consistent with the source locations. In this paper, we adopt the second approach and propose a parametric embedding to map the binaural cues to a low-dimensional space, where localization …

Citations: cited by 6 publications (7 citation statements)
References: 22 publications
“…As shown in Table I, the utilized network architectures for sound source localization include fully connected (FC) neural network [14], [17]- [19], [28]- [30], CNN [20], [21], [23], recurrent neural network (RNN) [27] and CRNN [22], [25], [26]. As for the network input, the spatial features used for training include inter-channel difference features such as IPD [14], [28]- [30], IID [14], [19], [28], eigenvectors of spatial correlation matrix [18], intensity vector [22], GCC [17], [19], [27], and the spatial spectra such as spatial pseudospectrum [23]. The magnitude spectrum [21], [27], [30] can be also fed together with the spatial features to the network, but cannot be used solely.…”
Section: A Deep Learning Based Sound Source Localization
Citation type: mentioning
confidence: 99%
“…Li et al [12], [13] used a convolutive transfer function model and an inter-frame spectral subtraction algorithm to separately suppress reverberation and noise in order to identify the DP-RTF. Tang et al [28] designed a siamese architecture to learn a low-dimensional representation of the localization cues that is consistent with the source location. Pak et al [29] and Cheng et al [30] trained DNN models to enhance the interference-contaminated IPD on the sinusoidal tracks.…”
Section: B Enhancement Of Signal Spectra and Localization Features
Citation type: mentioning
confidence: 99%
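
The siamese idea in the statement above can be illustrated with a pairwise contrastive loss. The sketch below is a generic margin-based formulation and is not necessarily the loss used by Tang et al.; the function name, the `same_location` label, and the default margin are assumptions made for illustration.

```python
import math

def contrastive_loss(z_a, z_b, same_location, margin=1.0):
    """Generic pairwise contrastive loss on two embedding vectors.

    z_a, z_b       : embeddings produced by the two (weight-sharing) branches
    same_location  : True if both inputs come from the same source location
                     (hypothetical labelling rule, assumed for this sketch)
    margin         : minimum distance wanted between dissimilar pairs (assumed value)
    """
    d = math.dist(z_a, z_b)  # Euclidean distance in the embedding space
    if same_location:
        # Similar pairs are pulled together: penalize any distance.
        return d ** 2
    # Dissimilar pairs are pushed apart, but only while closer than the margin.
    return max(0.0, margin - d) ** 2

# Usage: two 2-D embeddings that are 5 units apart.
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_location=True))
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_location=False))
```

Minimizing this quantity over many pairs encourages an embedding in which distance reflects whether two observations share a source location, which is the sense of "consistent with the source location" in the quoted text.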
“…With the development of deep learning techniques, lots of data-driven sound source localization works are built in a supervised manner [1]. According to the role of the deep learning model plays, these methods are classified into four categories, namely signal-to-location [2], feature-to-location [3,4], spatial spectrum-to-location [5], and feature-to-feature [6,7]-based methods. Among these methods, the feature-to-feature-based method is simple and effective for improving the performance of sound source localization in noisy and reverberant environments, as it is the data driven and the extracted features can adapt to various acoustic conditions.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Deep learning has been successfully applied to the localization task recently. Under the dual-stage localization framework, deep neural network (DNN) can be used to either extract localization features [3,4], or build the mapping from the localization features to source location [5,6]. Commonly used localization feature includes inter-channel time difference (ITD) [7], inter-channel phase difference (IPD) [8], inter-channel intensity difference (IID), relative transfer function (RTF) [9,10], etc.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
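
Two of the localization features named in the statement above, IPD and IID, can be computed per time-frequency bin from the complex spectra of the two channels. The following is a minimal sketch under stated assumptions: the function name `ipd_iid`, the left-minus-right sign convention, the decibel scaling, and the `eps` guard are illustrative choices, not taken from any of the cited papers.

```python
import cmath
import math

def ipd_iid(left, right, eps=1e-12):
    """Inter-channel phase and intensity difference for one time-frequency bin.

    left, right : complex STFT coefficients of the two channels
    eps         : small constant to avoid division by zero (assumed value)
    Returns (ipd in radians within [-pi, pi], iid in dB).
    """
    # Phase of the cross-spectrum equals phase(left) - phase(right), wrapped.
    ipd = cmath.phase(left * right.conjugate())
    # Level ratio of the two channels on a decibel scale.
    iid = 20.0 * math.log10((abs(left) + eps) / (abs(right) + eps))
    return ipd, iid

# Usage: left channel leads by 90 degrees and has twice the magnitude.
ipd, iid = ipd_iid(2j, 1 + 0j)
print(ipd, iid)
```

Other conventions (right minus left, natural-log ratios, or sine/cosine encodings of the phase) are equally common, so the signs and units here should be matched to whichever definition a given paper uses.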