Interspeech 2017
DOI: 10.21437/interspeech.2017-1580

Semantic Edge Detection for Tracking Vocal Tract Air-Tissue Boundaries in Real-Time Magnetic Resonance Images

Abstract: Recent developments in real-time magnetic resonance imaging (rtMRI) have enabled the study of vocal tract dynamics during production of running speech at high frame rates (e.g., 83 frames per second). Such large amounts of acquired data require scalable automated methods to identify different articulators (e.g., tongue, velum) for further analysis. In this paper, we propose a convolutional neural network with an encoder-decoder architecture to jointly detect the relevant air-tissue boundaries as well as to labe…

Cited by 26 publications (16 citation statements)
References 15 publications
“…The proposed method has shown to provide sharp delineation of articulator boundary with readouts up to ~8 ms at 1.5 T, which is 3-fold longer than the current standard practice 15 and would provide 1.7-fold improvement in scan efficiency. This would allow for improved accuracy and precision of speech analysis beginning with boundary segmentation, [50][51][52] which is often impaired by blurring artifact. 16 It would also potentially be feasible to achieve higher temporal resolution using a longer readout with image quality comparable to a short readout (see Figure 7) or to use spiral readouts at higher field strengths such as 3 T, which is available on more sites and provides higher SNR.…”
Section: Discussion (mentioning)
confidence: 99%
“…Deep learning-based image analysis was recently demonstrated in vocal tract shape analysis. An encoder-decoder CNN was demonstrated to automatically extract the vocal tract air-tissue boundaries [75,76].…”
Section: Deep Learning (mentioning)
confidence: 99%
“…To enable quantitative analysis of the information provided by these images, it is necessary to segment the anatomical features of interest, such as the vocal tract and articulators [1], [2], [3], [4], [5]. To avoid the time-consuming and expensive process of manual segmentation, several methods have been developed to perform this task semi or fully automatically [17], [18], [19], [20], [21], [22], [23], [24], [25]. One of these methods segmented the entire vocal tract [25], while the others only labelled pixels at air-tissue boundaries and therefore created a partial contour for each articulator [17], [18], [19], [20], [21], [22], [23], [24].…”
Section: Introduction (mentioning)
confidence: 99%