2009
DOI: 10.1250/ast.30.170

A flexible spectral modification method based on temporal decomposition and Gaussian mixture model

Abstract: Manipulating spectral structure often leads to degradation of speech quality, which is mainly due to insufficient smoothness of the modified spectra between frames, and ineffective spectral modification. This paper presents a new spectral modification method to improve the quality of modified speech. If frames are processed independently, discontinuous features may be generated. Therefore, a speech analysis technique called temporal decomposition (TD), which decomposes speech into event targets and event funct…

Cited by 8 publications (5 citation statements)
References 26 publications
“…This can yield better fits and smoother formant trajectories. After that, we approximate the modified spectral envelope with a GMM [22]. The GMM parameters are estimated by minimizing a loss function of the observed spectral envelope, and the GMM approximated one expressed by…”
Section: A Spectral Envelope Approximation With Gaussian-markov Model (mentioning)
confidence: 99%
“…1). Formants: The frequencies, bandwidths and amplitudes at F1, F2 and F3 were estimated by linear predictive coding (LPC) and spectral Gaussian mixture model based spectra (spectral-GMM) [8]. F1 and F2 were used to produce the vowel space.…”
Section: Feature Extraction (mentioning)
confidence: 99%
“…Then, the event functions and the event timings are determined by DP and the NMF update rules of Eqs. (7)-(8). Note that the event vectors are not updated at this stage.…”
Section: NTD (mentioning)
confidence: 99%
“…Temporal decomposition (TD) [2] can represent speech parameters as a set of temporally overlapped event functions and corresponding event vectors. TD has been used for many applications: speech coding [2,3], segmentation of speech signals [4,5], analysis of articulatory parameters [6,7] and modification for the speech spectrum [8] as well as for the speaking rhythm [9]. For these purposes, the event functions should be restricted to the range [0, 1].…”
Section: Introduction (mentioning)
confidence: 99%
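
The TD representation described in the last statement (overlapping event functions weighted by event vectors, with the functions confined to [0, 1]) can be illustrated with a minimal sketch. This is not the method of the cited paper: the triangular event-function shape, the fixed event locations, and the names `event_functions` and `reconstruct` are assumptions made purely for illustration. A real TD analysis would estimate the event timings and functions from the speech parameters.

```python
import numpy as np


def event_functions(centers, num_frames):
    """Build illustrative triangular event functions (an assumed shape).

    Each function equals 1 at its own event location and falls linearly
    to 0 at the neighbouring event locations, so every value lies in
    [0, 1] and only adjacent functions overlap.
    Returns an array of shape (num_events, num_frames).
    """
    frames = np.arange(num_frames)
    one_hot = np.eye(len(centers))
    # np.interp linearly interpolates between event locations and holds
    # the end values outside the first and last location.
    return np.stack(
        [np.interp(frames, centers, one_hot[k]) for k in range(len(centers))]
    )


def reconstruct(event_vectors, phi):
    """Approximate frame-wise parameters as a weighted sum of event vectors.

    event_vectors: (num_events, num_params), phi: (num_events, num_frames).
    Returns an array of shape (num_frames, num_params).
    """
    return phi.T @ event_vectors


# Hypothetical example: three events, two spectral parameters per frame.
centers = [0, 4, 8]
event_vectors = np.array([[1.0, 0.0], [3.0, 2.0], [5.0, 6.0]])
phi = event_functions(centers, num_frames=9)
trajectory = reconstruct(event_vectors, phi)
```

With this construction, the reconstructed trajectory passes through each event vector at its event location and moves linearly between neighbouring event vectors, which is why a TD-style representation yields smooth frame-to-frame parameter tracks.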