Visual sentiment analysis is attracting increasing attention as people more frequently express emotions through visual content. Recent Convolutional Neural Network (CNN) algorithms have considerably advanced emotion classification, which aims to distinguish among emotional categories and assigns a single dominant label to each image. However, the task is inherently ambiguous, since an image usually evokes multiple emotions and its annotation varies from person to person. In this work, we address the problem via label distribution learning and develop a multi-task deep framework that jointly optimizes classification and distribution prediction. While the proposed method is best suited to distribution datasets annotated by multiple voters, the majority voting scheme is widely adopted as the ground truth in this area, and few datasets provide multiple affective labels. Hence, we further exploit two weak forms of prior knowledge, expressed as similarity information between labels, to generate an emotional distribution for each category. Experiments conducted on the distribution datasets (i.e., Emotion6, Flickr LDL, and Twitter LDL) and the largest single-label dataset (i.e., Flickr and Instagram) demonstrate that the proposed method outperforms state-of-the-art approaches.
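The abstract above describes two concrete ingredients: a joint classification plus distribution-prediction objective, and the conversion of a single dominant label into a soft emotion distribution via a label-similarity prior. The PyTorch sketch below illustrates one plausible reading of these ideas; the backbone, the trade-off weight `lam`, and the row-normalized `similarity` matrix are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskEmotionNet(nn.Module):
    """Single backbone with one emotion head, supervised by both tasks."""
    def __init__(self, backbone, feat_dim, num_emotions=8):
        super().__init__()
        self.backbone = backbone                    # any CNN feature extractor
        self.head = nn.Linear(feat_dim, num_emotions)

    def forward(self, x):
        return self.head(self.backbone(x))

def joint_loss(logits, hard_label, target_dist, lam=0.5):
    """Cross-entropy for single-label classification plus KL divergence
    to the (possibly generated) emotion distribution."""
    ce = F.cross_entropy(logits, hard_label)
    kl = F.kl_div(F.log_softmax(logits, dim=1), target_dist, reduction='batchmean')
    return ce + lam * kl

def label_to_distribution(hard_label, similarity):
    """Turn a single dominant label into a soft distribution using a
    label-similarity prior; `similarity` is (num_emotions, num_emotions)
    with rows summing to 1."""
    return similarity[hard_label]
```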
Automatic assessment of sentiment from visual content has gained considerable attention with the increasing tendency to express opinions online. In this paper, we address visual sentiment analysis, which requires a high level of abstraction in the recognition process. Existing methods based on convolutional neural networks learn sentiment representations from the holistic image appearance. However, different image regions can contribute differently to the intended expression. This paper presents a weakly supervised coupled convolutional network with two branches to leverage localized information. The first branch detects a sentiment-specific soft map by training a fully convolutional network with a cross-spatial pooling strategy, which requires only image-level labels and thereby significantly reduces the annotation burden. The second branch utilizes both holistic and localized information by coupling the sentiment map with deep features for robust classification. We integrate the sentiment detection and classification branches into a unified deep framework and optimize the network end to end. Extensive experiments on six benchmark datasets demonstrate that the proposed method performs favorably against state-of-the-art methods for visual sentiment analysis.
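The two-branch idea can be pictured as follows: a fully convolutional detection branch produces per-class response maps that are spatially pooled into image-level scores (so only image-level labels are needed), and the classification branch couples the resulting sentiment map with holistic features. This is a rough sketch under assumed layer sizes and a mean-pooling choice; it is not the published architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoupledSentimentNet(nn.Module):
    def __init__(self, backbone, feat_ch=2048, num_classes=2):
        super().__init__()
        self.backbone = backbone                          # returns conv maps (B, C, H, W)
        self.det_conv = nn.Conv2d(feat_ch, num_classes, kernel_size=1)
        self.cls_fc = nn.Linear(feat_ch * 2, num_classes)

    def forward(self, x):
        feat = self.backbone(x)                           # (B, C, H, W)
        resp = self.det_conv(feat)                        # per-class response maps
        det_logits = resp.mean(dim=(2, 3))                # cross-spatial pooling -> image-level scores
        sent_map = torch.softmax(resp, dim=1).max(dim=1, keepdim=True)[0]  # soft sentiment map
        local = (feat * sent_map).mean(dim=(2, 3))        # features weighted by the map
        holistic = feat.mean(dim=(2, 3))                  # global average pooling
        cls_logits = self.cls_fc(torch.cat([holistic, local], dim=1))
        return det_logits, cls_logits

def total_loss(det_logits, cls_logits, label):
    # Both branches are supervised with image-level labels only.
    return F.cross_entropy(det_logits, label) + F.cross_entropy(cls_logits, label)
```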
Automatic assessment of sentiment from visual content has gained considerable attention with the increasing tendency to express opinions via images and videos online. This paper investigates visual sentiment analysis, which involves a high level of abstraction in the recognition process. While most current methods focus on improving holistic representations, we aim to utilize local information, inspired by the observation that both the whole image and local regions convey significant sentiment information. We propose a framework to leverage affective regions: we first use an off-the-shelf objectness tool to generate candidates and employ a candidate selection method to remove redundant and noisy proposals. A convolutional neural network (CNN) is then applied to each candidate to compute sentiment scores, and the affective regions are discovered automatically by taking both the objectness score and the sentiment score into account. Finally, the CNN outputs from local regions are aggregated with those of the whole image to produce the final prediction. Our framework requires only image-level labels, thereby significantly reducing the annotation burden otherwise required for training. This is especially important for sentiment analysis, as sentiment is abstract and labeling affective regions is subjective and labor-intensive. Extensive experiments show that the proposed algorithm outperforms state-of-the-art approaches on eight popular benchmark datasets.
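A simplified sketch of the region-scoring and aggregation step reads as follows. The `objectness` scores are assumed to come from an off-the-shelf proposal tool, and the fusion rule (top-k regions, simple averaging with the whole-image prediction) is an illustrative choice rather than the paper's exact aggregation.

```python
import torch
import torch.nn.functional as F
from torchvision.ops import roi_align

def score_regions(cnn, image, boxes, objectness, alpha=0.5, crop_size=224, k=3):
    """image: (3, H, W); boxes: float (N, 4) as (x1, y1, x2, y2); objectness: (N,).
    Combine objectness with the CNN's sentiment confidence and keep the
    top-scoring affective regions."""
    crops = roi_align(image.unsqueeze(0), [boxes], output_size=(crop_size, crop_size))
    probs = F.softmax(cnn(crops), dim=1)                  # (N, num_classes)
    sentiment_score = probs.max(dim=1).values             # confidence per region
    combined = alpha * objectness + (1 - alpha) * sentiment_score
    keep = combined.topk(k=min(k, boxes.shape[0])).indices
    return probs[keep]

def predict(cnn, image, boxes, objectness):
    region_probs = score_regions(cnn, image, boxes, objectness)
    global_probs = F.softmax(cnn(image.unsqueeze(0)), dim=1)
    # Aggregate local and holistic predictions (simple averaging here).
    return (region_probs.mean(dim=0) + global_probs[0]) / 2
```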
Emotion analysis of online user-generated textual content is important for natural language processing and social media analytics. Most previous emotion analysis approaches identify users' emotional states from text by classifying emotions into one of a finite set of categories, e.g., joy, surprise, anger, and fear. However, emotion analysis is inherently ambiguous, since a single sentence can evoke multiple emotions with different intensities. To address this problem, we introduce emotion distribution learning and propose a multi-task convolutional neural network for text emotion analysis. The end-to-end framework optimizes the distribution prediction and classification tasks simultaneously and learns robust representations for distribution datasets annotated by multiple voters. While most work adopts the majority voting scheme for ground-truth labeling, we also propose a lexicon-based strategy to generate distributions from a single label, which provides prior information for emotion classification. Experiments conducted on five public text datasets (i.e., SemEval, Fairy Tales, ISEAR, TEC, and CBET) demonstrate that our proposed method performs favorably against state-of-the-art approaches.
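The lexicon-based label-to-distribution idea can be illustrated with a toy example: emotion-lexicon hits in the sentence are combined with the annotated label and normalized into a distribution. The tiny lexicon, the mixing weight `beta`, and the mixing rule below are assumptions for illustration, not the paper's exact procedure.

```python
EMOTIONS = ["joy", "sadness", "anger", "fear", "surprise", "disgust"]
LEXICON = {"happy": "joy", "terrified": "fear", "furious": "anger"}  # tiny example lexicon

def label_to_distribution(tokens, gold_label, beta=0.7):
    # Count lexicon hits per emotion in the sentence.
    counts = {e: 0.0 for e in EMOTIONS}
    for tok in tokens:
        if tok in LEXICON:
            counts[LEXICON[tok]] += 1.0
    total = sum(counts.values())
    lexicon_dist = {e: (c / total if total else 1.0 / len(EMOTIONS))
                    for e, c in counts.items()}
    # Mix the hard label with the lexicon-derived distribution.
    return [beta * (1.0 if e == gold_label else 0.0) + (1 - beta) * lexicon_dist[e]
            for e in EMOTIONS]

print(label_to_distribution(["I", "was", "terrified", "but", "happy"], "fear"))
```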
Accurate grading of skin disease severity plays a crucial role in precise treatment. Acne vulgaris, the most common skin disease in adolescence, can be graded by evidence-based lesion counting as well as experience-based global estimation in clinical practice. However, because acne images of close severity look similar, it is challenging to count and grade acne accurately. In this paper, we address acne image analysis via Label Distribution Learning (LDL), which accounts for the ambiguous information among acne severities. Based on the professional grading criterion, we generate two acne label distributions that capture the relationships between similar lesion numbers and between similar severity grades, respectively. We also propose a unified framework for joint acne image grading and counting, optimized with a multi-task learning loss. In addition, we build the ACNE04 dataset with annotations of acne severity and lesion number for each image for evaluation. Experiments demonstrate that our proposed framework performs favorably against state-of-the-art methods. We make the code and dataset publicly available at https://github.com/xpwu95/ldl.
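A minimal sketch of one of the described ingredients follows: a soft label distribution over lesion counts centred on the annotated count (Gaussian smoothing is an assumed choice), combined with a joint grading-plus-counting loss. The count bound, the smoothing width, the count-to-grade thresholds, and the loss weight `lam` are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn.functional as F

MAX_COUNT = 65                         # assumed upper bound on lesion count

def count_distribution(count, sigma=3.0):
    """Soft distribution over lesion counts centred on the annotated count."""
    ks = torch.arange(MAX_COUNT + 1, dtype=torch.float32)
    d = torch.exp(-(ks - count) ** 2 / (2 * sigma ** 2))
    return d / d.sum()

def grade_from_count(count):
    """Map a lesion count to a severity grade (thresholds are illustrative)."""
    if count <= 5:
        return 0                       # mild
    if count <= 20:
        return 1                       # moderate
    if count <= 50:
        return 2                       # severe
    return 3                           # very severe

def joint_loss(count_logits, grade_logits, count, lam=0.6):
    """KL divergence for the counting branch plus cross-entropy for grading."""
    target = count_distribution(count).unsqueeze(0)
    kl = F.kl_div(F.log_softmax(count_logits, dim=1), target, reduction='batchmean')
    ce = F.cross_entropy(grade_logits, torch.tensor([grade_from_count(count)]))
    return lam * kl + (1 - lam) * ce
```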