Human annotations can help index digital resources as well as improve search and recommendation systems. However, human annotators may carry their biases and stereotypes into the labels they create when annotating digital content. These are then reflected in machine learning models trained on such data. The result is a reinforcement loop in which end-users are served stereotypical content by the search and recommendation systems they use on a daily basis. To break this loop, prior work has examined the impact on models of using more diverse data that better represents a diverse population. In this work, we look at how human annotators in the US annotate digital content that differs from the content popular on the Web and social media. We present the results of a controlled user study in which participants are asked to annotate videos of common tasks performed by people from various socio-economic backgrounds around the world. We test for the presence of social stereotypes and investigate the diversity of the provided annotations, especially since some abstract labels may reveal information about annotators' emotional state and judgment. We observe different types of annotations for content from different socio-economic levels. Furthermore, we find regional and income-level biases in annotation sentiment.
CCS CONCEPTS: • Human-centered computing → Empirical studies in collaborative and social computing.