Affect detection is an important pattern recognition problem that has inspired researchers from several areas. The field is in need of a systematic review due to the recent influx of Multimodal (MM) affect detection systems that differ in several respects and sometimes yield incompatible results. This article provides such a survey via a quantitative review and meta-analysis of 90 peer-reviewed MM systems. The review indicated that the state of the art mainly consists of person-dependent models (62.2% of systems) that fuse audio and visual (55.6%) information to detect acted (52.2%) expressions of basic emotions and simple dimensions of arousal and valence (64.5%) with feature-level (38.9%) and decision-level (35.6%) fusion techniques. However, there were also person-independent systems that considered additional modalities to detect nonbasic emotions and complex dimensions using model-level fusion techniques. The meta-analysis revealed that MM systems were consistently (85% of systems) more accurate than their best unimodal counterparts, with an average improvement of 9.83% (median of 6.60%). However, improvements were three times lower when systems were trained on natural (4.59%) versus acted data (12.7%). Importantly, MM accuracy could be accurately predicted (cross-validated R² of 0.803) from unimodal accuracies and two system-level factors. Theoretical and applied implications and recommendations are discussed. ACM Reference Format: Sidney K. D'Mello and Jacqueline Kory. 2015. A review and meta-analysis of multimodal affect detection systems.
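The fusion strategies the abstract contrasts can be illustrated concretely. The following is a minimal sketch, not drawn from the reviewed systems: feature-level (early) fusion concatenates per-modality feature vectors before a single classifier sees them, while decision-level (late) fusion trains one classifier per modality and combines their outputs, here by averaging class probabilities. The feature values and probabilities are invented for illustration.

```python
import numpy as np

# Hypothetical per-modality features for one sample (values are illustrative).
audio_feats = np.array([0.2, 0.7, 0.1])   # e.g., prosodic features
visual_feats = np.array([0.9, 0.3])       # e.g., facial feature measurements

# Feature-level (early) fusion: concatenate the modality features into a
# single vector, which would then be fed to one classifier.
fused_features = np.concatenate([audio_feats, visual_feats])

# Decision-level (late) fusion: each modality has its own classifier; their
# class-probability outputs are combined, e.g., by simple averaging.
p_audio = np.array([0.6, 0.3, 0.1])    # P(class | audio), hypothetical
p_visual = np.array([0.2, 0.5, 0.3])   # P(class | visual), hypothetical
p_fused = (p_audio + p_visual) / 2
predicted_class = int(np.argmax(p_fused))
```

Model-level fusion, also mentioned in the abstract, sits between these extremes: the modalities are combined inside the model itself (for example, in a coupled probabilistic model) rather than at the raw-feature or final-decision stage.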
Abstract: Children's oral language skills in preschool can predict their academic success later in life. As such, increasing children's skills early on could improve their success in middle and high school. To this end, we propose that a robotic learning companion could supplement children's early language education. The robot targets both the social nature of language learning and the adaptation necessary to help individual children. The robot is designed as a social character that interacts with children as a peer, not as a tutor or teacher. It will play a storytelling game, during which it will introduce new vocabulary words and model good story narration skills, such as including a beginning, middle, and end; varying sentence structure; and keeping cohesion across the story. We will evaluate whether adapting the robot's level of language to the child's (so that, as children improve their storytelling skills, so does the robot) influences (i) whether children learn new words from the robot, (ii) the complexity and style of stories children tell, and (iii) the similarity of children's stories to the robot's stories. We expect children will learn more from a robot that adapts to maintain an equal or greater ability than the children, and that they will copy its stories and narration style more than they would with a robot that does not adapt (a robot of lesser ability). However, we also expect that playing with a robot of lesser ability could prompt teaching or mentoring behavior from children, which could also be beneficial to language learning.
The ability to reliably and ethically elicit affective states in the laboratory is critical when studying and developing systems that can detect, interpret, and adapt to human affect. Many methods for eliciting emotions have been developed. In general, they involve presenting a stimulus to evoke a response from one or more emotion response systems. The nature of the stimulus varies widely. Passive methods include the presentation of images, film clips, and music. Active methods can involve social or dyadic interactions with other people, or behavior manipulations in which an individual is instructed to adopt facial expressions, postures, or other emotionally relevant behaviors. This chapter presents exemplar methods of each type, discusses their advantages and disadvantages, and briefly summarizes some additional methods.

Keywords/keyphrases: affect elicitation, emotional images, emotional film clips, emotional music, backward masking, behavior manipulation, social interaction, dyadic interaction

Glossary

Affect elicitation: Methods used to evoke (or induce) affective responses in individuals. These methods generally involve presenting a stimulus or immersing the subject in a situation to evoke a response from one or more emotion response systems. The nature of the stimulus varies and could include the presentation of images, film, or music; adopting facial expressions or postures; and social or dyadic interactions, among others.

Emotional images: Digital images or photographs that have been carefully selected and evaluated for their potential to evoke affective states in the viewer.

Emotional film clips: Short movie segments, usually including both images and sound, that have been selected and evaluated for their potential to evoke affective states in the viewer.

Emotional music: A recorded musical piece that has been selected and evaluated for its ability to evoke affective states in the listener.

Backward masking: A method used to block conscious awareness of a visual stimulus. The target stimulus is shown to an individual very briefly (such as 15-60 ms), followed immediately by a "mask" stimulus shown for a longer time (such as 500 ms). Individuals report being consciously aware of only the mask.

Behavior manipulation: A method in which individuals are instructed to adopt particular behaviors, such as body postures or facial expressions, in order to induce particular affective states.

Social interaction: A relationship between two or more individuals, which may be fleeting or enduring, in which an individual's actions and behavior are responsive to the actions and behavior of the other individuals. In one affect elicitation method, researchers try to create realistic social interaction scenarios that might evoke emotions in a more naturalistic context.

Dyadic interaction: A social interaction specifically between two individuals (see Social interaction). One affect elicitation method focuses on bringing pairs of individuals together to engage in an unrehearsed, minimally structured conversation in order to evoke affective responses.