2018
DOI: 10.1121/1.5067488
Dereverberation binaural source separation using deep learning

Abstract: This paper reports a deep-learning-based binaural separation method using the gammatone-frequency cepstral coefficient (GFCC) and multi-resolution cochleagram (MRCG) as spectral features, and the interaural time difference (ITD) and interaural level difference (ILD) as spatial features. A binary mask was estimated by a deep neural network (DNN) binary classifier that used these features as training data and the ideal ratio mask (IRM) as the training target. In the experiment, a male speaker as a target speech at azimuth 0° and a fem…
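The feature-and-mask pipeline the abstract describes can be sketched minimally in NumPy. This is not the paper's implementation; it only illustrates the general idea of per-time-frequency-unit interaural cues and an IRM-style training target. The interaural phase difference (IPD) stands in here for the ITD at the T-F-unit level, and all function names are illustrative.

```python
import numpy as np

def interaural_features(left_spec, right_spec, eps=1e-8):
    """Per-T-F-unit interaural cues from complex binaural STFTs.

    ILD: level difference in dB between the two ears.
    IPD: phase difference, a common T-F-level stand-in for the ITD.
    """
    ild = 20.0 * np.log10((np.abs(left_spec) + eps) / (np.abs(right_spec) + eps))
    ipd = np.angle(left_spec * np.conj(right_spec))
    return ild, ipd

def ideal_ratio_mask(target_spec, interferer_spec, eps=1e-8):
    """IRM: sqrt of target energy over total energy per T-F unit.

    This is the training target a mask-estimation DNN regresses toward.
    """
    t = np.abs(target_spec) ** 2
    i = np.abs(interferer_spec) ** 2
    return np.sqrt(t / (t + i + eps))

def apply_mask(mixture_spec, mask):
    """Separate by weighting the mixture spectrogram with the mask."""
    return mixture_spec * mask
```

At inference time a trained classifier would predict the mask from the spectral and spatial features; `apply_mask` then recovers the target's spectrogram from the mixture.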

Cited by 3 publications (3 citation statements) · References 0 publications
“…With the development of Neural Networks (NNs), there has been tremendous improvement in a variety of speech recognition and acoustic signal processing tasks [16]. The binaural dereverberation models in [8], [16] and [17] use an artificial neural network (ANN) for binaural dereverberation preprocessing, the model in [18] uses a recurrent neural network (RNN) and interaural cues for speech enhancement in reverberant noisy conditions, while the models in [19] and [20] use the U-Net (a deep convolutional neural network (CNN)) for dereverberation, but these are monaural models.…”
Section: Introduction
confidence: 99%
“…They have been increasingly used for learning non-linear mappings from low-level features to high-level ones in both the audio and computer vision domains. They have demonstrated promising results in a diverse range of audio signal processing tasks such as classification, de-reverberation [8,9], mixing [10] and synthesis [11]. Recently, DNNs have been used to model DRC [12], where an autoencoder maps unprocessed audio to processed audio and is conditioned on the vector of DRC controls.…”
Section: Introduction
confidence: 99%
“…Deep learning has demonstrated great utility at such diverse audio signal processing tasks as classification [13,14], onset detection [15], source separation [16], event detection [17], dereverberation [18,19], denoising [20], formant estimation [21], remixing [22], and synthesis [23–26], as well as dynamic range compression to automate the mastering process [27]. In the area of audio component modeling, deep learning has been used to model tube amplifiers [28] and most recently guitar distortion pedals.…”
Section: Introduction
confidence: 99%