Sentiment analysis (SA) aims to computationally understand the attitudes and views of opinion holders. Over the past decade, SA has achieved significant breakthroughs and found extensive applications, such as public opinion analysis and intelligent voice services. With the rapid development of deep learning, SA based on various modalities has become a research hotspot. However, most existing studies analyze each modality in isolation, and a systematic organization of SA methods across modalities is still lacking. Meanwhile, few surveys have yet covered multimodal SA (MSA). In this article, we first take modality as the organizing thread and design a novel framework of SA tasks to give researchers a comprehensive understanding of relevant advances in SA. We then detail the general workflows and recent advances of single-modal SA, discuss the similarities and differences among single-modal SA approaches in data processing and modeling to inform MSA, and summarize commonly used datasets to guide researchers toward suitable data and methods for different task types. Next, we propose a new taxonomy to fill the research gap in MSA, dividing MSA methods into multimodal representation learning and multimodal data fusion. We describe in detail the similarities and differences between these two lines of work and their latest advances, such as dynamic interaction between modalities, and further elaborate on multimodal fusion technologies. Moreover, we review advanced studies on multimodal alignment, chatbots, and Chat Generative Pre-trained Transformer (ChatGPT) in SA. Finally, we discuss the open research challenges of MSA and suggest four promising directions for future work, such as cross-modal contrastive learning and multimodal pretraining models.
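To make the taxonomy's fusion branch concrete, the following minimal PyTorch sketch (not taken from the survey; the module names, feature dimensions, and three-class sentiment setup are illustrative assumptions) contrasts feature-level (early) fusion, where modality features are concatenated before a shared classifier, with decision-level (late) fusion, where per-modality predictions are combined.

```python
import torch
import torch.nn as nn

class EarlyFusionSA(nn.Module):
    """Feature-level (early) fusion: concatenate modality features,
    then classify with a shared head. Dimensions are illustrative."""
    def __init__(self, text_dim=768, audio_dim=128, video_dim=256, num_classes=3):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(text_dim + audio_dim + video_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, text_feat, audio_feat, video_feat):
        # Concatenate along the feature dimension before classification.
        fused = torch.cat([text_feat, audio_feat, video_feat], dim=-1)
        return self.classifier(fused)

class LateFusionSA(nn.Module):
    """Decision-level (late) fusion: classify each modality separately,
    then average the per-modality logits."""
    def __init__(self, dims=(768, 128, 256), num_classes=3):
        super().__init__()
        self.heads = nn.ModuleList(nn.Linear(d, num_classes) for d in dims)

    def forward(self, *feats):
        logits = [head(f) for head, f in zip(self.heads, feats)]
        return torch.stack(logits).mean(dim=0)

# Toy usage: random tensors stand in for real text/audio/video encoder outputs.
text, audio, video = torch.randn(4, 768), torch.randn(4, 128), torch.randn(4, 256)
print(EarlyFusionSA()(text, audio, video).shape)  # torch.Size([4, 3])
print(LateFusionSA()(text, audio, video).shape)   # torch.Size([4, 3])
```

The dynamic cross-modal interaction methods discussed in the survey replace these static combination steps with learned mechanisms such as cross-modal attention; the sketch above only fixes the baseline early/late distinction that the taxonomy builds on.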