ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
DOI: 10.1109/icassp.2019.8682655

Adaptation of Multiple Sound Source Localization Neural Networks with Weak Supervision and Domain-adversarial Training

Cited by 35 publications (28 citation statements)
References 15 publications
“…Randomization (using TrainS+ instead of TrainS) did not seem to further improve the ensembling results. While most unsupervised domain adaptation methods studied in this paper outperform lower bound methods which is contrary to the results presented in [11], adversarial methods are notoriously difficult to train [29]. The high variance of Lf in our results further supports this.…”
Section: Results (supporting)
confidence: 76%
“…If not, D can learn the dissimilarities and the generator will unwantedly distort its predictions to satisfy (2). Constraining the output is common for domain adaptation in SSL with, for example, entropy methods [15,16] or weak supervision [11]. However, such methods can degrade the performance by boosting incorrect low confidence predictions [11] resulting in more false positives.…”
Section: Domain Adaptation For Multiple Sound Source 2D Localization (mentioning)
confidence: 99%
“…The LOCATA dataset, presented as a part of IEEE-AASP Challenge on Acoustic Source Localization and Tracking, consists of audio recordings of one or two moving and up to four static sound sources, captured with a multitude of microphone arrays, with number of microphone per array ranging from 2 (binaural system using a dummy head) to 32 (Eigenmike EM32 spherical array). The shortcoming of the LOCATA dataset is that neither the room dimensions nor the distance of the origin of the coordinate system to a corner of the room is presented, which imposes a limitation of usage of the LOCATA dataset for evaluation of learning-based SSL methods, such as presented by He et al. (2018, 2019) or Chakrabarty and Habets (2019), where the model is trained on semisynthetic data, as it impossible to accurately simulate the environment matching the real-world. Also, the moving sound sources were the human subjects, walking in front of the microphone array and talking.…”
Section: Previous Work (mentioning)
confidence: 99%