Automatic facial expression analysis aims to analyse human facial expressions and classify them into discrete categories. Existing methods rely on extracting information from video sequences and either apply some form of subjective thresholding of dynamic information or attempt to identify the particular individual frames in which the expected behaviour occurs. These methods are inefficient because they require additional subjective information or tedious manual work, or they fail to exploit the information contained in the dynamic signature of facial movements for expression recognition. In this paper, a novel framework is proposed for automatic facial expression analysis which extracts salient information from video sequences but does not rely on any subjective preprocessing or additional user-supplied information to select frames with peak expressions. The experimental framework demonstrates that the proposed method outperforms static expression recognition systems in terms of recognition rate. The approach does not rely on action units (AUs) and therefore eliminates errors which would otherwise propagate to the final result from incorrect initial identification of AUs. The proposed framework explores a parametric space of over 300 dimensions and is tested with six state-of-the-art machine learning techniques. Such robust and extensive experimentation provides an important foundation for assessing the performance of future work. A further contribution of the paper is a user study, conducted to investigate the correlation between human cognitive systems and the proposed framework, for understanding human emotion classification and the reliability of public databases.
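The abstract describes comparing several machine learning techniques over a high-dimensional parametric feature space derived from facial dynamics. The following is a minimal illustrative sketch of that kind of comparison, not the authors' implementation: it assumes scikit-learn, uses placeholder random features standing in for the 300-dimensional dynamic descriptors, and compares three off-the-shelf classifiers with cross-validation.

```python
# Minimal sketch (not the paper's code): compare classifiers on fixed-length
# dynamic-feature vectors extracted per video sequence.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(240, 300))   # placeholder: 240 sequences, 300-D dynamic features
y = rng.integers(0, 6, size=240)  # placeholder: six expression classes

classifiers = {
    "SVM (RBF)": SVC(kernel="rbf", C=1.0),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "k-NN": KNeighborsClassifier(n_neighbors=5),
}
for name, clf in classifiers.items():
    scores = cross_val_score(make_pipeline(StandardScaler(), clf), X, y, cv=5)
    print(f"{name}: mean cross-validated accuracy {scores.mean():.3f}")
```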
Over the past decade, computer scientists and psychologists have made great efforts to collect and analyze facial dynamics data that exhibit different expressions and emotions. Such data are commonly captured as videos and transformed into feature-based time series prior to any analysis. However, analytical tasks such as expression classification have been hindered by a lack of understanding of the complex data space and the associated algorithm space. Conventional graph-based time-series visualization has also been found inadequate to support such tasks. In this work, we adopt a visual analytics approach by visualizing the correlation between the algorithm space and our goal: classifying facial dynamics. We transform multiple feature-based time series for each expression in measurement space into a multi-dimensional representation in parameter space. This enables us to use parallel coordinates visualization to gain an understanding of the algorithm space, providing a fast and cost-effective means of supporting the design of analytical algorithms.
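The key step here is mapping each expression sequence to a small set of derived parameters and inspecting them with parallel coordinates. Below is a minimal sketch of that idea, assuming pandas and matplotlib; the column names and values are placeholders, not the paper's actual features.

```python
# Minimal sketch (illustrative only): each row is one expression sequence
# described by a few derived parameters; parallel coordinates show how the
# parameter space separates the expression classes.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import parallel_coordinates

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "peak_amplitude": rng.normal(1.0, 0.3, 90),   # hypothetical parameter
    "rise_time":      rng.normal(0.5, 0.1, 90),   # hypothetical parameter
    "duration":       rng.normal(2.0, 0.4, 90),   # hypothetical parameter
    "symmetry":       rng.normal(0.8, 0.2, 90),   # hypothetical parameter
    "expression":     np.repeat(["happy", "sad", "surprise"], 30),
})

parallel_coordinates(df, class_column="expression", alpha=0.4)
plt.title("Parameter-space view of expression sequences")
plt.show()
```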
We present the Cardiff Conversation Database (CCDb), a unique 2D audiovisual database containing natural conversations between pairs of people. The database currently contains 30 conversations; to date, eight of them are fully annotated for speaker activity, facial expressions, head motion, and non-verbal utterances. In this paper we describe the data collection and annotation process. We also provide results of baseline experiments in which an SVM classifier was used to identify which parts of the recordings come from the frontchannel speaker and which are backchannel signals. We believe this database will make a useful contribution to the computer vision, affective computing, and cognitive science communities by providing raw data, features, annotations, and baseline comparisons.
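The baseline described is a binary SVM separating frontchannel (speaker) segments from backchannel (listener feedback) segments. The following is a minimal sketch of such a baseline, not the CCDb experiments themselves; it assumes scikit-learn and uses placeholder per-window features.

```python
# Minimal sketch (not the CCDb baseline code): binary SVM labelling short
# windows of conversation features as frontchannel or backchannel.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import classification_report

rng = np.random.default_rng(2)
X = rng.normal(size=(400, 20))    # placeholder per-window audiovisual features
y = rng.integers(0, 2, size=400)  # 1 = frontchannel, 0 = backchannel

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X_tr, y_tr)
print(classification_report(y_te, model.predict(X_te),
                            target_names=["backchannel", "frontchannel"]))
```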
The goal of the present research was to study the relative roles of facial and acoustic cues in the formation of trustworthiness impressions. Furthermore, we investigated the relationship between perceived trustworthiness and perceivers' confidence in their judgments. Twenty-five young adults watched a series of short clips in which the video and audio channels were digitally aligned to form five combinations of actors' face and voice trustworthiness levels (neutral face + neutral voice, neutral face + trustworthy voice, neutral face + non-trustworthy voice, trustworthy face + neutral voice, and non-trustworthy face + neutral voice). Participants provided subjective ratings of the trustworthiness of the actor in each video and indicated their level of confidence in each of those ratings. Results revealed a main effect of face-voice channel combination on trustworthiness ratings, and no significant effect of channel combination on confidence ratings. We conclude that there is a clear superiority of facial over acoustic cues in the formation of trustworthiness impressions, propose a method for future investigation of the judgment-confidence link, and outline the practical implications of the experiment.
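The reported result is a main effect of face-voice combination on ratings. As a rough illustration of how such an effect can be tested, the sketch below runs a one-way ANOVA over simulated ratings; this is a simplification (the study design is within-subjects) and the data are placeholders, not the study's results.

```python
# Minimal sketch (illustrative, not the study's analysis): one-way ANOVA for a
# main effect of face-voice combination on trustworthiness ratings.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
conditions = ["neutral face + neutral voice", "neutral face + trustworthy voice",
              "neutral face + non-trustworthy voice", "trustworthy face + neutral voice",
              "non-trustworthy face + neutral voice"]
# Simulated ratings (1-7 scale), 25 participants per condition; means are made up.
ratings = [np.clip(rng.normal(loc=m, scale=1.0, size=25), 1, 7)
           for m in (4.0, 4.1, 3.9, 5.2, 2.8)]

for name, r in zip(conditions, ratings):
    print(f"{name:>38s}: mean rating {r.mean():.2f}")

f_stat, p_value = stats.f_oneway(*ratings)
print(f"Main effect of channel combination: F = {f_stat:.2f}, p = {p_value:.4f}")
```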