Stereotypes are a type of social bias that is pervasive in the texts that computational models use. Some stereotypes are especially difficult to detect because they do not rely on personal attributes. This is the case for stereotypes about immigrants, a social category that is a frequent target of hate speech and discrimination. We propose a new approach to detecting stereotypes about immigrants in text that focuses not on the personal attributes assigned to the minority but on the frames, that is, the narrative scenarios, in which the group is placed in public speeches. We propose a fine-grained taxonomy, grounded in social psychology, with six categories that capture the different dimensions of the stereotype (positive vs. negative), and we annotate a novel dataset, StereoImmigrants, with sentences that Spanish politicians have stated in the Congress of Deputies. We aggregate these categories into two supracategories: Victims, which expresses positive stereotypes about immigrants, and Threat, which expresses negative stereotypes. We carried out two preliminary experiments: the first evaluates the automatic detection of stereotypes, and the second distinguishes between the two supracategories of stereotypes about immigrants. In these experiments, we employed state-of-the-art transformer models (monolingual and multilingual) and four classical machine learning classifiers. We achieve an accuracy above 0.83 with the BETO model in both experiments, showing that transformers can capture stereotypes about immigrants with a high level of accuracy.
Controversial topics are part of everyday life, and opinions about them can be either truthful or deceptive. Deceptive opinions are expressed to mislead other people in order to gain some advantage. In most cases, humans cannot tell whether an opinion is deceptive or truthful; computational approaches, however, have been used successfully for this purpose. In this work, we evaluate a representation based on character n-gram features for detecting deceptive opinions. We consider opinions in three domains studied in the state of the art: abortion, the death penalty, and personal feelings about one's best friend. We found character n-grams to be effective for detecting deception in these controversial domains, even more so than psycholinguistic features. Our results indicate that this representation captures information about style and content that is useful for this task. We therefore conclude that the proposed representation is competitive, offering a good trade-off between simplicity and performance.
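As a rough illustration of what a character n-gram representation looks like, the following minimal sketch counts overlapping character n-grams and turns them into relative-frequency vectors. The n-gram range (3 to 5), the lowercasing step, the frequency normalization, and the function names `char_ngrams` and `to_vector` are assumptions made for this example; the abstract does not specify the exact configuration used in the work, and the sample sentences are hypothetical.

```python
from collections import Counter


def char_ngrams(text, n_min=3, n_max=5):
    """Count overlapping character n-grams for every n in [n_min, n_max].

    Lowercasing and the default range are illustrative assumptions.
    """
    text = text.lower()
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def to_vector(counts, vocabulary):
    """Map n-gram counts to relative frequencies over a fixed vocabulary."""
    total = sum(counts.values()) or 1  # avoid division by zero on empty text
    return [counts[gram] / total for gram in vocabulary]


# Hypothetical opinions, invented for this example only.
opinions = [
    "I truly believe my best friend is the kindest person I know.",
    "My best friend is honestly, really, the most amazing person ever.",
]

# Build a shared vocabulary, then one feature vector per opinion.
all_counts = [char_ngrams(text) for text in opinions]
vocabulary = sorted(set().union(*all_counts))
vectors = [to_vector(counts, vocabulary) for counts in all_counts]
```

Vectors of this kind can be passed to any standard classifier. Because the n-grams span word fragments, punctuation, and spaces, they can reflect both stylistic habits and content words, which matches the abstract's observation that the representation captures style and content.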