2020
DOI: 10.1016/j.neucom.2019.09.106
Audio–visual domain adaptation using conditional semi-supervised Generative Adversarial Networks

Cited by 29 publications (27 citation statements)
References 25 publications
“…Common knowledge-transfer approaches in multi-modal methods include fine-tuning well-trained models on a specific type of signal (Vielzeuf et al., 2017; Yan et al., 2018; Huang et al., 2019; Ortega et al., 2019), or fine-tuning different well-trained models on both speech and video signals (Ouyang et al., 2017; Zhang et al., 2017; Ma et al., 2019). Transfer learning has also been used in multi-modal methods to leverage knowledge from one signal to another (e.g., video to speech) to reduce potential bias (Athanasiadis et al., 2019). While these methods achieve promising performance, applying more recent transfer learning methods, or utilizing more types of signals and leveraging knowledge between multiple signals, is likely to improve transfer learning further and boost emotion recognition accuracy.…”
Section: Multi-modal Transfer Learning For Emotion Recognitionmentioning
confidence: 99%
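The fine-tuning strategy described in the statement above can be sketched in a few lines. This is a hypothetical, framework-free illustration (not the cited authors' code): a "pretrained" feature extractor is frozen, and only a new linear classification head is trained on the target modality. The dimensions, the toy binary label, and the logistic-regression head are all assumptions for the sake of a runnable example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen, pretrained encoder: a fixed random ReLU projection.
W_frozen = 0.1 * rng.normal(size=(40, 16))
def extract(x):
    return np.maximum(x @ W_frozen, 0.0)  # frozen: never updated below

# Toy target-domain data (e.g. features from a new signal type).
X = rng.normal(size=(200, 40))
y = (X[:, 0] > 0).astype(float)  # toy binary "emotion" label

# Only the new head's weights w are trained (logistic regression, plain GD).
F = extract(X)  # features are computed once and stay fixed
w = np.zeros(16)
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(F @ w)))
    w -= 0.1 * F.T @ (p - y) / len(y)

acc = np.mean(((F @ w) > 0) == y)  # training accuracy of the adapted head
```

Freezing the backbone keeps the number of trainable parameters small, which is the usual motivation when the target-modality dataset is limited.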
“…Similar tests on the CREMAD [21] and RVDSR [22] datasets were reported by [23] (65% and 58.33%), [24] (52.52% and 47.11%), and [25] (74.0% and 67.5%), while [26] reported 62.84% on the CREMAD [21] dataset only.…”
Section: ( )mentioning
confidence: 55%
“…Our work is closely related to reconstruction from audio recordings. For example, Oh et al. [11] and Athanasiadis et al. [12] have used speech to reconstruct the speaker's face. Chen et al. [13] and Hao et al. [14] use conditional GANs and cycle GANs to achieve cross-modal audio-visual generation of musical performances.…”
Section: Audio-visual Cross-modal Learningmentioning
confidence: 99%
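The conditioning mechanism behind the conditional GANs mentioned above can be shown at the shape level. In this hypothetical sketch (untrained weights, made-up dimensions, not the cited systems), both the generator and the discriminator receive the condition — here a one-hot class label, standing in for an emotion or modality tag — concatenated to their usual input.

```python
import numpy as np

rng = np.random.default_rng(1)
n_classes, z_dim, x_dim = 4, 8, 32

def one_hot(labels, n):
    return np.eye(n)[labels]

# Untrained single-layer generator and discriminator, for structure only.
G_w = rng.normal(size=(z_dim + n_classes, x_dim))
D_w = rng.normal(size=(x_dim + n_classes, 1))

def generate(z, y):
    # Generator sees noise AND the label: output depends on the condition.
    return np.tanh(np.concatenate([z, one_hot(y, n_classes)], axis=1) @ G_w)

def discriminate(x, y):
    # Discriminator judges the (sample, label) pair, not the sample alone.
    logits = np.concatenate([x, one_hot(y, n_classes)], axis=1) @ D_w
    return 1.0 / (1.0 + np.exp(-logits))

z = rng.normal(size=(5, z_dim))
y = np.array([0, 1, 2, 3, 0])
fake = generate(z, y)          # shape (5, 32): one sample per conditioned label
score = discriminate(fake, y)  # shape (5, 1): real/fake probability per pair
```

Because the discriminator scores sample–label pairs, the generator is pushed to produce samples consistent with the requested condition, which is what makes class- or modality-conditioned cross-modal generation possible.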