2018 IEEE Spoken Language Technology Workshop (SLT) 2018
DOI: 10.1109/slt.2018.8639582

A Deep Learning Approach for Data Driven Vocal Tract Area Function Estimation

Abstract: In this paper we present a data-driven vocal tract area function (VTAF) estimation method using deep neural networks (DNNs). We approach the VTAF estimation problem with sequence-to-sequence learning neural networks, where regression over a sliding window is used to learn an arbitrary non-linear one-to-many mapping from the input feature sequence to the target articulatory sequence. We propose two schemes for efficient estimation of the VTAF: (1) a direct estimation of the area function values and (2) an indirect esti…
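The sliding-window regression mentioned in the abstract can be illustrated with a minimal sketch of how per-frame input features might be stacked into context windows before being passed to a regressor. This is not the authors' implementation: the function name `stack_context`, the window sizes, and the choice to repeat edge frames as padding are all assumptions made for illustration, and the excerpt does not specify the paper's actual features, window length, or network.

```python
from typing import List, Sequence


def stack_context(
    frames: Sequence[Sequence[float]], left: int = 5, right: int = 5
) -> List[List[float]]:
    """Concatenate each frame with its neighbours into one window vector.

    Hypothetical helper: `left` and `right` are the numbers of context
    frames taken before and after the centre frame (assumed values).
    Frames beyond the sequence edges are replaced by the nearest edge
    frame, which is an assumed padding choice.
    """
    if not frames:
        return []
    last = len(frames) - 1
    windows: List[List[float]] = []
    for t in range(len(frames)):
        window: List[float] = []
        for offset in range(-left, right + 1):
            # Clamp the index so edge windows reuse the first or last frame.
            idx = min(max(t + offset, 0), last)
            window.extend(frames[idx])
        windows.append(window)
    return windows


# Toy input: 4 frames of 2-dimensional features (made-up values).
toy_frames = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]
windows = stack_context(toy_frames, left=1, right=1)
```

Each row of `windows` would serve as one regression input, paired with the target articulatory vector (for example, area function values) for the centre frame.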

Cited by 3 publications (2 citation statements)
References 16 publications
“…While these snapshots indicate that the proposed DTRN+AMCR model well captures the tract boundaries in all regions and under different articulatory configurations, a possible AoI classification error is also visible in the first snapshot due to a challenging calculation for the closest point of the contour to the landmark coordinate. Video samples of the tracked MRI contours are also available for online demonstration 3.…”
Section: Subject Dependent Error Assessment (mentioning)
confidence: 99%
“…In this context, development of automatic algorithms to detect the landmarks defining the contours of the vocal tract is necessary. To name a few applications, VT contour estimation is used as a preprocessing step to obtain and study the time evolution of the vocal tract cross sectional area function in [3]. In another work, Toutios et al. [4] propose a text-to-speech synthesis system using the estimated VT contours in rtMRI.…”
Section: Introduction (mentioning)
confidence: 99%