Learning multi-level representations for affective image recognition

Zhang, Hao; Xu, Dan; Luo, Gaifang; He, Kangjian

doi:10.1007/s00521-022-07139-y

Cited by 13 publications

(9 citation statements)

References 49 publications

“…Unlike their method of fine-tuning existing deep architectures, You et al [11] design a progressively learned CNN to classify image sentiment. Some researchers also combine different hierarchical features in deep models for sentiment classification, from global to local perspectives [12,13]. It is also worth mentioning that in the recent work of You et al [14], AR [5], and R-CNNGSR [6], information from different image regions is used for image sentiment analysis.…”

Section: Deep Representations For Emotion Recognition (mentioning)

confidence: 99%

“…
| Method | ArtPhoto | Abstract | Twitter I (5-agree) | Twitter I (4-agree) | Twitter I (3-agree) | EmotionROI |
|---|---|---|---|---|---|---|
| SentiBank [27] | 67.74 | 64.95 | 71.32 | 68.28 | 66.63 | 66.18 |
| PAEF [28] | 67.85 | 70.05 | 72.90 | 69.61 | 67.92 | 75.24 |
| DeepSentiBank [30] | 68.73 | 71.19 | 76.35 | 70.15 | 71.25 | 70.11 |
| PCNN [11] | 70.96 | 70.84 | 82.54 | 76.52 | 76.36 | 73.58 |
| VGG-16 w/o Fine-tuning [32] | 67.61 | 68.86 | 83.44 | 78.67 | 75.49 | 72.25 |
| VGG-16 [32] | 70.09 | 72.48 | 84.35 | 82.26 | 76.75 | 77.02 |
| SentiNet-A [29] | - | - | 85.10 | 80.70 | 77.7 | - |
| AR [5] | 74.80 | 76.03 | 88.65 | 85.10 | 81.06 | 81.26 |
| R-CNNGSR [6] | 75.02 | 75.89 | - | - | - | 81.36 |
| MLM w/o L_cis [13] | 74.32 | 76.82 | 87.19 | 82.95 | 80.42 | 81.53 |
| MLM [13] | 75… |
”

Section: Model (mentioning)

confidence: 99%

“…SentiNet-A [29] is an attention-based method, and AR [5] and R-CNNGSR [6] combines object detection to discover affective regions. MLM [13] combines different levels of features for emotion recognition by studying the effect of different levels of features on the performance of emotion recognition. In addition, they also propose a class imbalance loss to solve the class imbalance issue by optimizing the deep model.…”

Section: Comparison To State Of the Arts (mentioning)

confidence: 99%

“…Several widely used hierarchical models in the vision domain are experimented with, including VGG16 [31], AlexNet [32] and ResNet50 [24], which are initialized with the pre-trained parameters on ImageNet and then fine-tuned on FI. Researchers then developed specific deep networks based on the CNN backbone to handle visual sentiment analysis, including AR [5], MAP [19], and MLM [13]. Compared to them, our method achieves an overall performance improvement.…”

Section: Comparison To State Of the Arts (mentioning)

confidence: 99%

Exploring affective image representation with visual attention and aesthetic fusion

JiXiong, Hao, KangJian et al. 2023

Fourteenth International Conference on Graphics and Image Processing (ICGIP 2022)

Affective image analysis aims to understand the sentiment of different images. The challenge is to develop a discriminative representation that bridges the affective gap between low-level features and high-level emotions. Most existing studies bridge the gap by designing deep models carefully to learn global representations in one shot directly or identify image emotion by extracting features at different levels in the model. They ignore that both local regions of an image and relationships between them impact emotional representation learning. This paper develops an affective image analysis method based on the aesthetic fusion hybrid attention network (AFHA). A modular hybrid attention block is designed to extract image emotion features and model long-range dependencies of images. By stacking hybrid attention blocks in ResNet-style, we obtain an affective representation backbone. Furthermore, considering that image emotion is inseparable from aesthetics, we employ a modified ResNet to extract image aesthetics. Finally, through a fusion strategy, the image's emotion is considered with the aesthetics conveyed. Experiments demonstrate the close relationship between emotion and aesthetics, and our plan has an excellent competitive effect compared with existing methods on the image sentiment analysis dataset.


“…From 2005 to 2012, the same search yielded 591 results; from 2013 to 2021, the results of the same search had jumped to 3529. In that time, emotion classification has been the subject of analysis in diverse domains including image recognition [1], animation [2], root cause diagnosis [3], online reviews [4], and social network analysis [5][6][7]. Twitter, in particular, has become a lightning rod for researchers aiming to model human language through various machine learning techniques.…”

Section: Introduction (mentioning)

confidence: 99%

Emotion classification of Indonesian Tweets using Bidirectional LSTM

Glenn, LaCasse, Cox 2023

Neural Comput & Applic

Emotion classification can be a powerful tool to derive narratives from social media data. Traditional machine learning models that perform emotion classification on Indonesian Twitter data exist but rely on closed-source features. Recurrent neural networks can meet or exceed the performance of state-of-the-art traditional machine learning techniques using exclusively open-source data and models. Specifically, these results show that recurrent neural network variants can produce more than an 8% gain in accuracy in comparison with logistic regression and SVM techniques and a 15% gain over random forest when using FastText embeddings. This research found a statistical significance in the performance of a single-layer bidirectional long short-term memory model over a two-layer stacked bidirectional long short-term memory model. This research also found that a single-layer bidirectional long short-term memory recurrent neural network met the performance of a state-of-the-art logistic regression model with supplemental closed-source features from a study by Saputri et al. [8] when classifying the emotion of Indonesian tweets.
