ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023
DOI: 10.1109/icassp49357.2023.10094797

Real-Time MRI Video Synthesis from Time Aligned Phonemes with Sequence-to-Sequence Networks

Abstract: Real-time magnetic resonance imaging (rtMRI) of the midsagittal plane of the mouth is of interest for speech production research. In this work, we focus on estimating utterance-level rtMRI video from the spoken phoneme sequence. We obtain time-aligned phonemes from forced alignment and convert them into frame-level phoneme sequences aligned with the rtMRI frames. We propose a sequence-to-sequence learning model with a transformer phoneme encoder and a convolutional frame decoder. We then modify the learning by using i…
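
The conversion from time-aligned phonemes to a frame-level phoneme sequence can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `phonemes_to_frames`, the frame rate, the padding label, and the choice to label each frame by the phoneme active at the frame's centre time are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical alignment entry: (phoneme label, start time in s, end time in s).
Interval = Tuple[str, float, float]


def phonemes_to_frames(alignment: List[Interval],
                       frame_rate: float,
                       pad: str = "sil") -> List[str]:
    """Expand time-aligned phonemes into one label per video frame.

    Assumption: each frame takes the phoneme whose interval contains
    the frame's centre time; frames covered by no interval get `pad`.
    """
    if not alignment:
        return []
    duration = max(end for _, _, end in alignment)
    n_frames = int(round(duration * frame_rate))
    labels = []
    for i in range(n_frames):
        t = (i + 0.5) / frame_rate  # centre time of frame i
        label = pad
        for phone, start, end in alignment:
            if start <= t < end:
                label = phone
                break
        labels.append(label)
    return labels


# Usage with an illustrative frame rate of 10 frames per second.
example = [("sil", 0.0, 0.1), ("AH", 0.1, 0.3), ("T", 0.3, 0.4)]
print(phonemes_to_frames(example, frame_rate=10.0))
# ['sil', 'AH', 'AH', 'T']
```

The resulting label sequence has one entry per video frame, which is the property the abstract relies on when it describes phoneme sequences aligned with rtMRI frames.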

Cited by 1 publication
References 25 publications