2012
DOI: 10.1016/j.specom.2011.08.003

Data-driven voice source waveform analysis and synthesis

Abstract: The paper presents a voice source waveform modeling technique based on principal component analysis (PCA) and Gaussian mixture modeling (GMM). The voice source is obtained by inverse-filtering speech with the estimated vocal tract filter. This decomposition is useful in speech analysis, synthesis, recognition and coding. Existing models of the voice source signal are based on function-fitting or physically motivated assumptions and although they are well defined, estimation of their parameters is not well und…

Cited by 15 publications (10 citation statements)
References 44 publications
“…This behaviour is similar to that of the principal components of the VS proposed in Ref. 16. The gross shape of the ILPR is captured by only 12 coefficients.…”
Section: Pitch Synchronous Discrete Cosine Transform and The Number O…
Citation type: supporting
confidence: 83%
“…This process is called inverse filtering [Fritzell 1992, Walker and Murphy 2007, Drugman et al 2012, Gudnason et al 2012]. The vocal tract (throat, mouth and in some cases nose) forms the tube, which is characterized by its resonances.…”
Section: Feature Extraction
Citation type: mentioning
confidence: 99%
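
The inverse filtering mentioned in this statement can be illustrated with a minimal sketch. It assumes plain autocorrelation-method linear prediction (LPC) as the vocal tract estimate, which is a simpler stand-in and not necessarily the estimator used in the cited papers. The signal is synthetic (white noise through a known all-pole filter), and the names `lpc` and `inverse_filter` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def lpc(frame, order):
    """Autocorrelation-method LPC via Levinson-Durbin; returns a with a[0] = 1."""
    n = len(frame)
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for model order i
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def inverse_filter(frame, a):
    """Apply the FIR filter A(z) to the signal to remove the all-pole envelope."""
    return np.convolve(frame, a)[: len(frame)]

# Synthetic stand-in for speech: white-noise excitation through 1 / A(z).
# The coefficients below are an assumption chosen to give a stable resonance.
rng = np.random.default_rng(0)
excitation = rng.standard_normal(4000)
true_a = np.array([1.0, -1.3, 0.8])
speech = np.zeros_like(excitation)
for t in range(len(speech)):
    speech[t] = excitation[t]
    for k in range(1, len(true_a)):
        if t - k >= 0:
            speech[t] -= true_a[k] * speech[t - k]

a_est = lpc(speech, order=2)                # estimated "vocal tract" filter
residual = inverse_filter(speech, a_est)    # estimated source signal
```

On real speech the residual would approximate the voice source only as well as the all-pole model fits the vocal tract; here the excitation is known, so the residual can be compared against it directly.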
“…A set of the first PCA components were then modelled with Gaussian Mixture Models (GMMs). The obtained GMMs were shown by (Gudnason et al (2012)) to enable parameterizing source features, such as non-flatness of the closed phase, that traditional approaches fail to model. Finally, a new parameterization method of the voice source was recently proposed by (Kane and Gobl (2013a)) by adopting dynamic programming using which the settings originating from a manual voice source analysis could be combined into an automatic, machine-based analysis.…”
Section: Glottal Source Parameterization
Citation type: mentioning
confidence: 99%
“…Instead of fitting the computed glottal flow with a pre-defined function, a parameterization scheme based on a data-driven approach was recently proposed by (Gudnason et al (2012)). More specifically, a glottal flow estimate was first computed with IAIF and the obtained waveform was then processed with Principal Component Analysis (PCA) in order to achieve dimensionality reduction.…”
Section: Glottal Source Parameterization
Citation type: mentioning
confidence: 99%
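
The two-stage scheme described in the statements above (PCA for dimensionality reduction, then GMMs on the leading components) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the cited authors' implementation: the "glottal" frames are synthetic pulse shapes rather than IAIF output, the GMM uses diagonal covariances with a simple deterministic initialisation, and `pca_fit` and `gmm_fit_diag` are hypothetical helper names.

```python
import numpy as np

def pca_fit(frames, n_components):
    """Return the mean frame and the leading principal directions (as rows)."""
    mean = frames.mean(axis=0)
    _, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
    return mean, vt[:n_components]

def gmm_fit_diag(x, n_mix, n_iter=50):
    """Fit a diagonal-covariance GMM with EM (assumed, simplified variant)."""
    n, _ = x.shape
    # Deterministic init: spread the starting means along the first coordinate.
    idx = np.linspace(0, n - 1, n_mix).astype(int)
    means = x[np.argsort(x[:, 0])[idx]].copy()
    variances = np.tile(x.var(axis=0), (n_mix, 1)) + 1e-6
    weights = np.full(n_mix, 1.0 / n_mix)
    for _ in range(n_iter):
        # E-step: responsibilities from log(weight * Gaussian density)
        diff = x[:, None, :] - means[None, :, :]
        log_p = (np.log(weights)
                 - 0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
                 - 0.5 * np.sum(diff ** 2 / variances, axis=2))
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and per-dimension variances
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ x) / nk[:, None]
        diff = x[:, None, :] - means[None, :, :]
        variances = np.einsum("nk,nkd->kd", resp, diff ** 2) / nk[:, None] + 1e-6
    return weights, means, variances

# Synthetic stand-in data: two families of length-64 pulse shapes plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 64)
shape_a = np.sin(np.pi * t) ** 2
shape_b = t * np.exp(-5.0 * t)
frames = np.vstack([shape_a + 0.01 * rng.standard_normal((200, 64)),
                    shape_b + 0.01 * rng.standard_normal((200, 64))])

mean, basis = pca_fit(frames, n_components=3)
scores = (frames - mean) @ basis.T      # low-dimensional PCA features
recon = mean + scores @ basis           # waveforms resynthesised from features
weights, means, variances = gmm_fit_diag(scores, n_mix=2)
```

The number of components and mixtures here is arbitrary; with real voice source frames both would need to be chosen from the data.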