2021
DOI: 10.1109/tmm.2021.3068565
Intelligibility Enhancement Via Normal-to-Lombard Speech Conversion With Long Short-Term Memory Network and Bayesian Gaussian Mixture Model

Cited by 3 publications (2 citation statements) | References 39 publications
“…Compared with previous studies, a deep neural network is used for high-dimensional mapping in place of a Gaussian model, and a tilt-correction module is added to further reduce the mapping error of the formant amplitudes. To account for the short-term correlation of speech and the accuracy of spectral-slope mapping, they also proposed an intelligibility enhancement algorithm based on an LSTM and a BGMM for converting normal speech to Lombard speech [21], using the LSTM to map the first 20-dimensional MCC features and thereby characterize the spectral tilt more accurately.…”
Section: Transformation Methods Based On Feature Mapping
confidence: 99%
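The excerpt above describes frame-wise mapping of 20-dimensional MCC features with an LSTM. Below is a minimal NumPy sketch of what such a sequence-to-sequence feature mapping might look like; the hidden size, random weights, and single-layer cell are illustrative assumptions and stand in for a trained model, not the authors' actual implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """Minimal LSTM cell (forward pass only) for frame-wise feature mapping."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        # Stacked weights for the input, forget, candidate and output gates.
        self.W = rng.standard_normal((4 * hidden_dim, input_dim + hidden_dim)) * 0.1
        self.b = np.zeros(4 * hidden_dim)
        self.hidden_dim = hidden_dim

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        H = self.hidden_dim
        i = sigmoid(z[:H])            # input gate
        f = sigmoid(z[H:2 * H])       # forget gate
        g = np.tanh(z[2 * H:3 * H])   # candidate cell state
        o = sigmoid(z[3 * H:])        # output gate
        c = f * c + i * g
        h = o * np.tanh(c)
        return h, c

def map_sequence(cell, out_proj, frames):
    """Map a sequence of 'normal' MCC frames to Lombard-style MCC frames."""
    h = np.zeros(cell.hidden_dim)
    c = np.zeros(cell.hidden_dim)
    out = []
    for x in frames:
        h, c = cell.step(x, h, c)
        out.append(out_proj @ h)      # project hidden state back to 20 dims
    return np.array(out)

# 50 frames of 20-dim MCC features (random data standing in for real speech).
rng = np.random.default_rng(1)
frames = rng.standard_normal((50, 20))
cell = LSTMCell(input_dim=20, hidden_dim=64)
out_proj = rng.standard_normal((20, 64)) * 0.1
mapped = map_sequence(cell, out_proj, frames)
print(mapped.shape)  # (50, 20): one converted 20-dim MCC vector per input frame
```

The recurrent hidden state is what lets the mapping exploit the short-term correlation between adjacent frames that the excerpt highlights as the motivation for using an LSTM instead of a per-frame Gaussian model.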
“…Yan et al. used an LSTM to construct an end-to-end neural network framework for machine translation and introduced a local attention mechanism into the model to improve translation quality [15]. Li et al. used reinforcement-learning actor-critic training to evaluate the value of the notes output by an LSTM network and thereby update the network's generation strategy; the generated music has a stable structure and a more consistent style [16]. Saqib et al. conducted two experiments on a speech corpus comparing bidirectional and unidirectional LSTMs and found that the bidirectional LSTM outperformed both the unidirectional LSTM and a conventional RNN [17].…”
Section: Research Status Of LSTM
confidence: 99%