2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW) 2023
DOI: 10.1109/icasspw59220.2023.10193719

Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos

Abstract: In this paper, we propose a neural end-to-end system for voice preserving, lip-synchronous translation of videos. The system combines multiple component models and produces a video of the original speaker speaking in the target language that is lip-synchronous with the target speech, yet maintains the emphases in speech, the voice characteristics, and the face video of the original speaker. The pipeline starts with automatic speech recognition including emphasis detection, followed by a translation model. The tr…

Cited by 6 publications
References 63 publications (89 reference statements)