2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP) 2016
DOI: 10.1109/mlsp.2016.7738817

A neural network based algorithm for speaker localization in a multi-room environment

Cited by 65 publications (63 citation statements)
References 21 publications
“…Further, methods [4,18,20,25] proposed to simultaneously detect DOAs of overlapping sound events by estimating the number of active sources from the data itself. Most methods used a classification approach, thereby estimating the source presence likelihood at a fixed set of angles, while [22,23] used a regression approach and let the DNN produce continuous output.…”
Section: B Sound Source Localization (mentioning)
confidence: 99%
“…Spectral power | azi (Full) for each class | Multiple | CNN | Circular
Yiwere et al [21] | ILD, cross-correlation | azi and dist | 1 | FC | Binaural | ×
Ferguson et al [22] | GCC, cepstrogram | azi and dist (regression) | 1 | CNN | Linear | ×
Vesperini et al [23] | GCC | x and y (regression) | 1 | FC | Distributed | ×
Sun et al [24] | GCC | azi and ele | 1 | PNN | Cartesian | ×
Adavanne et al [25] | Phase and magnitude spectrum | azi and ele (Full) | Multiple | CRNN | Generic | ×…”
mentioning
confidence: 99%
“…By combining information from multiple microphone arrays, directions can be merged to obtain source locations. Given a microphone array signal from multiple microphones, direction estimation can be formulated in two ways: 1) by forming a fixed grid of possible directions, and by using multilabel classification to predict if there is an active source in a specific direction [115], or 2) by using regression to predict the directions [116] or spatial coordinates [117] of target sources. In addition to this categorization, differences in various deep learning methods for localization lie in the input features used, the network topology, and whether one or more sources are localized.…”
Section: Applications (mentioning)
confidence: 99%
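
The two formulations named in the statement above differ mainly in how the training target is encoded. The following is a minimal sketch, not taken from any of the cited papers: the function names, the 10-degree grid step, and the (cos, sin) encoding of the regression target are all assumptions made for illustration.

```python
import numpy as np

def grid_multilabel_target(source_azimuths_deg, grid_step_deg=10.0):
    """Multilabel classification target: one binary entry per grid direction.

    The grid step is a hypothetical choice; several entries can be 1 at once,
    which is how multiple active sources are represented.
    """
    n_bins = int(round(360.0 / grid_step_deg))
    target = np.zeros(n_bins, dtype=np.float32)
    for az in source_azimuths_deg:
        # Snap to the nearest grid direction, wrapping around at 360 degrees.
        idx = int(round((az % 360.0) / grid_step_deg)) % n_bins
        target[idx] = 1.0
    return target

def regression_target(source_azimuth_deg):
    """Regression target for a single source: continuous (cos, sin) pair.

    Encoding the angle on the unit circle (an assumption here) avoids the
    discontinuity between 359 and 0 degrees.
    """
    theta = np.deg2rad(source_azimuth_deg)
    return np.array([np.cos(theta), np.sin(theta)], dtype=np.float32)

def decode_regression(vec):
    """Map a (cos, sin) output back to an azimuth in [0, 360) degrees."""
    return float(np.rad2deg(np.arctan2(vec[1], vec[0])) % 360.0)
```

A classifier trained on the first target can report several directions at once but is limited to the grid resolution, whereas the regression target is continuous but, in this simple form, describes only one source.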
“…To improve the robustness of DOA estimation, deep neural networks (DNNs) have been proposed to learn a mapping between signal features and a discretized DOA space [17][18][19][20][21]. Various features such as phasemaps [17,18] and GCC-PHAT [21] have been used as inputs.…”
Section: Introduction (mentioning)
confidence: 99%
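
GCC-PHAT, mentioned above as an input feature, is the generalized cross-correlation of two microphone signals with the phase transform weighting. A minimal NumPy sketch follows; the function name, the `max_lag` parameter, and the sign convention (a positive lag means the second signal is delayed relative to the first) are assumptions for illustration, not details from the cited works.

```python
import numpy as np

def gcc_phat(x, y, max_lag):
    """Return the GCC-PHAT of x and y for lags -max_lag..max_lag (max_lag >= 1).

    Index k of the result corresponds to lag k - max_lag, in samples.
    """
    n = len(x) + len(y)              # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)           # cross-power spectrum
    cross /= np.abs(cross) + 1e-12   # PHAT weighting: keep only the phase
    cc = np.fft.irfft(cross, n=n)
    # Negative lags sit at the end of the inverse FFT output; move them first.
    return np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
```

The vector of correlation values over the lag range, rather than only the peak position, is the kind of quantity that can be fed to a network as a feature.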
“…Various features such as phasemaps [17,18] and GCC-PHAT [21] have been used as inputs. In [22], the cosines and sines of the frequency-wise phase differences between microphones, termed as cosine-sine interchannel phase difference (CSIPD) features, have been shown to perform as well as phasemaps for DOA estimation, despite their lower dimensionality.…”
Section: Introduction (mentioning)
confidence: 99%
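
The CSIPD features described above are the cosines and sines of the per-frequency phase difference between microphones. The sketch below computes them for one microphone pair; the function name, the input shapes, and the choice to concatenate the cosine and sine parts along the frequency axis are assumptions, since the statement does not specify the layout.

```python
import numpy as np

def csipd_features(stft_ref, stft_other):
    """Cosine-sine interchannel phase difference for one microphone pair.

    stft_ref, stft_other: complex STFT arrays of shape (frames, freq_bins).
    Returns a real array of shape (frames, 2 * freq_bins).
    """
    # Phase of the cross-spectrum equals the phase difference between channels.
    ipd = np.angle(stft_other * np.conj(stft_ref))
    # cos and sin are continuous across the +/- pi wrap, unlike the raw angle.
    return np.concatenate((np.cos(ipd), np.sin(ipd)), axis=-1)
```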