ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
DOI: 10.1109/icassp.2019.8682576

Joint Training of Complex Ratio Mask Based Beamformer and Acoustic Model for Noise Robust ASR

Cited by 43 publications
(39 citation statements)
References 18 publications
“…Recently, several researches have shown that integrating the multi-channel information collected by a microphone array can improve the mask estimation of the reference channel and lead to better speech separation. It has been found in previous research that the complex ratio masks (CRMs) outperform both the binary masks (BMs) and real-value ratio masks (RMs) on speech separation [26], [43] and enhancement [44] tasks. For this reason, the CM based TF masking approach is implemented in this work.…”
Section: B. TF Masking (mentioning)
confidence: 98%
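
To illustrate the distinction the statement above draws between real-value ratio masks and complex ratio masks, here is a minimal NumPy sketch. It is not taken from the cited or citing paper. The function names, the square-root form of the real-valued mask, and the random test spectra are all assumptions made for illustration. The point it shows is that a real-valued mask only rescales the magnitude of each time-frequency bin and keeps the mixture phase, whereas a complex mask can also correct the phase.

```python
import numpy as np


def ideal_ratio_mask(clean_stft, noise_stft, eps=1e-12):
    """Real-valued mask in [0, 1] (one common definition, assumed here)."""
    speech_power = np.abs(clean_stft) ** 2
    noise_power = np.abs(noise_stft) ** 2
    return np.sqrt(speech_power / (speech_power + noise_power + eps))


def ideal_complex_ratio_mask(clean_stft, mixture_stft, eps=1e-12):
    """Complex mask M such that M * Y is approximately S in every bin."""
    return clean_stft * np.conj(mixture_stft) / (np.abs(mixture_stft) ** 2 + eps)


def apply_mask(mask, mixture_stft):
    """Element-wise masking; a complex mask changes magnitude and phase."""
    return mask * mixture_stft


def demo(seed=0, frames=4, freqs=6):
    """Return reconstruction errors (real mask, complex mask) on toy spectra."""
    rng = np.random.default_rng(seed)
    shape = (frames, freqs)
    clean = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    noise = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    mixture = clean + noise

    real_est = apply_mask(ideal_ratio_mask(clean, noise), mixture)
    complex_est = apply_mask(ideal_complex_ratio_mask(clean, mixture), mixture)

    real_err = np.mean(np.abs(real_est - clean) ** 2)
    complex_err = np.mean(np.abs(complex_est - clean) ** 2)
    return real_err, complex_err
```

Calling `demo()` returns the mean squared reconstruction error for each oracle mask. On this toy data the complex mask reconstructs the clean spectrum almost exactly, while the real-valued mask leaves a residual caused by the uncorrected phase. In practice the masks are estimated by a network rather than computed from oracle signals, so the gap is smaller than in this idealized setting.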
“…In mask-based MVDR approaches, the deep neural networks are used to estimate the real-value [4], [5], [23] or complex [26] TF masks of the target speech M_y(t, f) and other interfering sources M_n(t, f) respectively. The PSD matrices corresponding to each source can be calculated with the estimated TF masks shown as follows:…”
Section: E. Mask-based MVDR (mentioning)
confidence: 99%
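
The equation that the statement above refers to is elided in this excerpt, so the following is only a sketch of one widely used formulation, not the citing paper's own formula. It assumes real-valued masks, a mask-weighted time average for each PSD matrix, and the reference-channel MVDR solution w(f) = (Φ_n⁻¹ Φ_y) u / trace(Φ_n⁻¹ Φ_y). The function names, array layout, and diagonal-loading value are hypothetical choices made for illustration.

```python
import numpy as np


def masked_psd(stft, mask, eps=1e-8):
    """Mask-weighted spatial covariance (PSD) matrix per frequency.

    stft: complex array, shape (channels, frames, freqs)
    mask: real array in [0, 1], shape (frames, freqs)
    returns: complex array, shape (freqs, channels, channels)
    """
    # Sum over frames of mask(t, f) * Y(t, f) Y(t, f)^H
    weighted = np.einsum("tf,ctf,dtf->fcd", mask, stft, np.conj(stft))
    # Normalize by the total mask weight in each frequency bin
    return weighted / (mask.sum(axis=0)[:, None, None] + eps)


def mvdr_weights(psd_speech, psd_noise, ref_channel=0, diag_load=1e-6):
    """Reference-channel MVDR weights, shape (freqs, channels)."""
    n_channels = psd_noise.shape[-1]
    # Diagonal loading keeps the noise PSD invertible (assumed value)
    loaded = psd_noise + diag_load * np.eye(n_channels)
    # Solve Phi_n X = Phi_y for every frequency instead of inverting
    numerator = np.linalg.solve(loaded, psd_speech)
    trace = np.trace(numerator, axis1=-2, axis2=-1)[:, None]
    # Pick the column of the reference channel (multiplication by u)
    return numerator[..., ref_channel] / (trace + 1e-12)


def beamform(weights, stft):
    """Apply y(t, f) = w(f)^H Y(t, f); returns shape (frames, freqs)."""
    return np.einsum("fc,ctf->tf", np.conj(weights), stft)
```

A typical use would be `psd_y = masked_psd(stft, mask_y)`, `psd_n = masked_psd(stft, mask_n)`, then `beamform(mvdr_weights(psd_y, psd_n), stft)`, where `mask_y` and `mask_n` stand for the network outputs M_y(t, f) and M_n(t, f). Complex masks, as in the paper this report covers, would need a different PSD estimate than the real-valued weighting shown here.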
“…Speech enhancement is useful in many applications, such as speech recognition [1,2,3] and hearing aids [4,5]. Recently, the research community has witnessed a shift in methodology from conventional signal processing methods [6,7] to data-driven enhancement approaches, particularly those based on deep learning paradigms [8,9,3,10,11].…”
Section: Introduction (mentioning)
confidence: 99%