2022 IEEE Spoken Language Technology Workshop (SLT) 2023
DOI: 10.1109/slt54892.2023.10023284

AVSE Challenge: Audio-Visual Speech Enhancement Challenge

Abstract: Audio-visual speech enhancement is the task of improving the quality of a speech signal when video of the speaker is available. It opens up the opportunity of improving speech intelligibility in adverse listening scenarios that are currently too challenging for audio-only speech enhancement models. The Audio-Visual Speech Enhancement (AVSE) challenge aims to set the first benchmark in this area. We provide participants with datasets and scripts to test their audio-visual speech enhancement models under a commo…

Cited by 9 publications (6 citation statements)
References: 23 publications
“…To create the AV dataset used in the evaluation, we selected a set of TED and TEDx videos of public lectures delivered by a single speaker. Details about the train, dev and eval AV datasets can be found in [14]. After selecting the videos, we extracted sentences based on the manual transcriptions of the talks.…”
Section: Audio-visual Evaluation Dataset (mentioning)
confidence: 99%
“…After validating our proposed method, we conducted a large-scale evaluation of speech enhancement systems submitted to [14]. We evaluated nine systems (including the baseline model), and the original (i.e., not enhanced) samples.…”
Section: Evaluation Of AVSE Systems (mentioning)
confidence: 99%
“…Integrating them seamlessly can be a significant challenge to achieve a comprehensive and effective AV HAT technology for individuals with hearing loss. The new Audio-Visual Speech Enhancement (AVSE) Challenge takes the first step toward accomplishing this by setting benchmarks in this research area [9].…”
Section: Complexity Of Integrating Multiple Technologies (mentioning)
confidence: 99%
“…[Footnote 1: https://www.who.int/news-room/fact-sheets/detail/deafness-and-hearing-loss] processing [7]. The multi-modal aspect of AV hearing assistive technology (HAT) may provide a range of benefits for users, including the capability to selectively enhance speech based on the user's eye gaze [8,9] and lipreading-based technologies [10]. Given that speech enhancement in noisy environments is especially challenging, adding the visual aspect to hearing aid algorithms has been predicted to result in more reliable performance [11].…”
Section: Introduction (mentioning)
confidence: 99%