Affect recognition is a difficult problem that most often relies on human-annotated data to train automated systems. Because humans perceive emotion differently depending on personality, cognitive state, and past experiences, it is important to collect ratings from multiple individuals to assess the emotional content of a corpus; these ratings are later aggregated with rules such as majority vote. With the increased use of crowdsourcing services for perceptual evaluations, collecting large amounts of data is now feasible, so it becomes important to ask how much data is needed to create well-trained classifiers. How different are the aggregated labels collected from five raters from those obtained from twenty evaluators? Is it worthwhile to spend resources to increase the number of evaluators beyond those used in conventional laboratory studies? This study evaluates the consensus labels obtained by incrementally adding new evaluators during perceptual evaluations. Using majority vote over categorical emotional labels, we compare the changes in the aggregated labels, starting with one rater and finishing with 20 raters. The large number of evaluators in a subset of the MSP-IMPROV database and the ability to filter annotators by quality allow us to better understand label aggregation as a function of the number of annotators.
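The incremental aggregation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the example labels are invented for illustration, and treating a tie for first place as "no consensus" (`None`) is an assumption, since the abstract does not say how ties are handled.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label, or None if the top count is tied.

    Returning None on ties is an assumption of this sketch.
    """
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def incremental_consensus(labels):
    """Consensus label after the first 1, 2, ..., len(labels) raters."""
    return [majority_vote(labels[:k]) for k in range(1, len(labels) + 1)]


def stability_curve(items):
    """Fraction of items whose k-rater consensus equals the full-pool consensus.

    `items` is a list of per-item label lists, each ordered by rater arrival.
    Only the first n raters are used, where n is the shortest list length.
    """
    curves = [incremental_consensus(labels) for labels in items]
    n = min(len(curve) for curve in curves)
    return [
        sum(curve[k] == curve[n - 1] for curve in curves) / len(curves)
        for k in range(n)
    ]


if __name__ == "__main__":
    # Invented example: two items, three raters each.
    items = [["happy", "sad", "happy"], ["sad", "sad", "sad"]]
    print(stability_curve(items))
```

Plotting the output of `stability_curve` against the number of raters gives one way to see how quickly the aggregated label settles as evaluators are added.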