One crucial aspect of democracy is fair information sharing. While it is hard to prevent biases in news, they should be identified for better transparency. We propose an approach to automatically characterize biases that takes into account structural differences and that is efficient for long texts. This yields new ways to provide explanations for a textual classifier, going beyond mere lexical cues. We show that: (i) the use of discourse-based structure-aware document representations compare well to local, computationally heavy, or domain-specific models on classification tasks that deal with textual bias (ii) our approach based on different levels of granularity allows for the generation of better explanations of model decisions, both at the lexical and structural level, while addressing the challenge posed by long texts.
This paper describes our approach to Subtask 1 "News Genre Categorization" of SemEval-2023 Task 3 "Detecting the Category, the Framing, and the Persuasion Techniques in Online News in a Multi-lingual Setup", which aims to determine whether a given news article is an opinion piece, an objective report, or satirical. We fine-tuned the domain-specific language model POLITICS, which was pre-trained on a large-scale dataset of more than 3.6M English political news articles following ideologydriven pre-training objectives. In order to use it in the multilingual setup of the task, we added as a pre-processing step the translation of all documents into English. Our system ranked among the top systems overall in most language, and ranked 1st on the English dataset.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.