Contribution of Vocal Tract and Glottal Source Spectral Cues in the Generation of Acted Happy and Aggressive Spanish Vowels

Freixes, Marc; Socoró, Joan Claudi; Alías, Francesc

doi:10.3390/app12042055

Cited by 1 publication

(2 citation statements)

References 40 publications


“…Regarding the addition of expressiveness to the numerical generation of voice, previous works have been developed using ST-QCP to analyse the characteristics of GS and VT in aggressive and happy female vowels [9,14]. However, the results obtained in the present study suggest that maybe it would be better to use QCP without the spectral tilt correction when dealing with female speech.…”

Section: Discussion (mentioning)

confidence: 71%

“…Additionally, the manipulation of the vocal tract characteristics using simulations based on finite element methods (FEM) has enabled the production of effects such as the singing formant in 3D-based articulatory voice generation [12]. Therefore, from these works, it can be concluded that, for the production of expressive speech, a proper model and adjustment of the vocal tract response and the glottal source signal is of paramount importance, as is considering their varying relevance depending on the target speaking style [13,14].…”

Section: Introduction (mentioning)

confidence: 99%


Evaluation of Glottal Inverse Filtering Techniques on OPENGLOT Synthetic Male and Female Vowels

Freixes, Joglar-Ongay, Socoró et al. 2023

Applied Sciences

Self Cite


Current articulatory-based three-dimensional source–filter models, which allow the production of vowels and diphthongs, still present very limited expressiveness. Glottal inverse filtering (GIF) techniques can become instrumental to identify specific characteristics of both the glottal source signal and the vocal tract transfer function to resemble expressive speech. Several GIF methods have been proposed in the literature; however, their comparison becomes difficult due to the lack of common and exhaustive experimental settings. In this work, first, a two-phase analysis methodology for the comparison of GIF techniques based on a reference dataset is introduced. Next, state-of-the-art GIF techniques based on iterative adaptive inverse filtering (IAIF) and quasi closed phase (QCP) approaches are thoroughly evaluated on OPENGLOT, an open database specifically designed to evaluate GIF, computing well-established GIF error measures after extending male vowels with their female counterparts. The results show that GIF methods obtain better results on male vowels. The QCP-based techniques significantly outperform IAIF-based methods for almost all error metrics and scenarios and are, at the same time, more stable across sex, phonation type, F0, and vowels. The IAIF variants improve the original technique for most error metrics on male vowels, while QCP with spectral tilt compensation achieves a lower spectral tilt error for male vowels than the original QCP.

