2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015
DOI: 10.1109/icassp.2015.7177984

Spectral mask estimation using deep neural networks for inter-sensor data ratio model based robust DOA estimation

citations
Cited by 10 publications
(6 citation statements)
references
References 20 publications
“…is method can accurately locate the speaker in a dynamic acoustic scene. Zheng et al [62] proposed to use different values of SNR and noise to train DNNs, which achieved higher DOA accuracy at low SNR and improved the intersensor data ratio (ISDR) performance of a single acoustic vector sensor (AVS) in a noisy environment. Wang et al [60] proposed to use acoustic vector sensors (AVSs) to estimate the DOA of multiple voice sources through clustering of data ratios between sensors.…”
Section: Speech DOA Estimation
Citation type: mentioning
confidence: 99%
“…If the task is regarded as regression, the output of ANN is the angle of arrival, and the number of output neurons is decided by the algorithm. Moreover, for some other ANN-based DOA estimation methods to complete the regression task [7], [17], post-processing means obtaining the DOA by combining the output features of ANN with special information about the received signals. Phase discontinuity: The azimuth ranges from 0 to 360°.…”
Section: Post-processing for CNN
Citation type: mentioning
confidence: 99%
“…These algorithms process the original data to learn simple features in the lower layers and more complicated features in the higher layers and have been successfully applied to DOA estimation. In [7], the restricted Boltzmann machines (RBMs) are employed for unsupervised pre-training, and then a supervised fine-tuning stage is used to accomplish the deep neural network (DNN) training. The system works well for the DOA estimation of the speech signals.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…We note that similar issues exist in the acoustic signal processing and speech recognition community. The AoA of acoustic signals have been widely studied using temporal-spectral features from microphone arrays [25], [26]. Despite huge differences in propagating speed and wavelength, acoustic and RF AoA estimation share common challenges such as multipath, also known as reverberation, and interference of multiple sources [27].…”
Section: Introduction
Citation type: mentioning
confidence: 99%