2021
DOI: 10.3390/biomimetics6010012
|View full text |Cite
|
Sign up to set email alerts
|

Discriminative Multi-Stream Postfilters Based on Deep Learning for Enhancing Statistical Parametric Speech Synthesis

Abstract: Statistical parametric speech synthesis based on Hidden Markov Models has been an important technique for the production of artificial voices, due to its ability to produce results with high intelligibility and sophisticated features such as voice conversion and accent modification with a small footprint, particularly for low-resource languages where deep learning-based techniques remain unexplored. Despite the progress, the quality of the results, mainly based on Hidden Markov Models (HMM) does not reach thos… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1

Citation Types

0
2
0

Year Published

2022
2022
2023
2023

Publication Types

Select...
3

Relationship

0
3

Authors

Journals

citations
Cited by 3 publications
(2 citation statements)
references
References 27 publications
0
2
0
Order By: Relevance
“…This lack of intra-layer connections allows the weights of the connections between the visible and hidden nodes to be learned by Hinton’s contrastive divergence algorithm [ 11 ]. In conjunction with deep belief networks, these neural networks have been applied to address several problems, for example, natural language processing [ 12 ], image classification [ 13 ], forecasting time series [ 14 ], and voice synthesis [ 15 ], among others.…”
Section: Materials and Methodsmentioning
confidence: 99%
“…This lack of intra-layer connections allows the weights of the connections between the visible and hidden nodes to be learned by Hinton’s contrastive divergence algorithm [ 11 ]. In conjunction with deep belief networks, these neural networks have been applied to address several problems, for example, natural language processing [ 12 ], image classification [ 13 ], forecasting time series [ 14 ], and voice synthesis [ 15 ], among others.…”
Section: Materials and Methodsmentioning
confidence: 99%
“…Finally, Coto-Jiménez [ 6 ] presents a new bioinspired approach to postfiltering synthesized voices with the application of discriminative postfilters, with several long short-term memory (LSTM) deep neural networks. His work analyses the discriminative postfilters obtained using five voices, evaluated using three objective measures, Mel cepstral distance, and subjective tests.…”
mentioning
confidence: 99%