2022
DOI: 10.1587/transinf.2021edp7020

Speaker-Independent Audio-Visual Speech Separation Based on Transformer in Multi-Talker Environments

Abstract: Speech separation is the task of extracting target speech while suppressing background interference components. In applications like video telephones, visual information about the target speaker is available, which can be leveraged for multi-speaker speech separation. Most previous multi-speaker separation methods are based on convolutional or recurrent neural networks. Recently, Transformer-based Seq2Seq models have achieved state-of-the-art performance in various tasks, such as neural machine translat…

Cited by: 1 publication
References: 59 publications