2019
DOI: 10.1007/978-3-030-27947-9_9

The FRENK Datasets of Socially Unacceptable Discourse in Slovene and English

Authors: …, Darja Fišer, and Tomaž Erjavec

Abstract: In this paper we present datasets of Facebook comment threads to mainstream media posts in Slovene and English developed inside the Slovene national project FRENK which cover two topics, migrants and LGBT, and are manually annotated for different types of socially unacceptable discourse (SUD). The main advantages of these datasets compared to the existing ones are identical sampling procedures, pro…

citations
Cited by 35 publications
(34 citation statements)
references
References 7 publications
“…FRENK (Ljubešić et al., 2019) The FRENK datasets consist of Facebook comments in English and Slovene covering LGBT and migrant topics. The datasets were manually annotated for fine-grained types of socially unacceptable discourse (e.g., violence, offensiveness, threat).…”
Section: Data (mentioning)
confidence: 99%
“…The English language is well-resourced and researched [19,22,24]. Recently, hate speech detection studies appeared for Croatian [25,27,29] and Slovene [31,33,34].…”
Section: Hate Speech Detection (mentioning)
confidence: 99%
“…The Slovene dataset was produced in the Slovenian national project FRENK. The text dataset used in the experiment is a combination of two different studies of Facebook comments [33]. The first group of comments was collected on LGBT homophobia topics, while the second on antimigrants posts.…”
Section: Hate Speech Datasets (mentioning)
confidence: 99%
“…The Dutch LiLaH corpus consists of approximately 36,000 Facebook comments on online news articles related to migrants or the LGBT community mined from three popular Flemish newspaper pages (HLN, Het Nieuwsblad and VRT). The corpus, which has been used in several recent studies on hate speech detection in Dutch, e.g., (Markov et al., 2021; Ljubešić et al., 2020), was annotated for the type and target of hateful comments following the same procedure and annotation guidelines as presented in (Ljubešić et al., 2019), that is, with respect to the type of hate speech, the possible classes were violent speech and offensive speech (either triggered by the target's personal background, e.g., religion, gender, sexual orientation, nationality, etc., or on the basis of individual characteristics), inappropriate speech (without a specific target), and appropriate speech. The targets, on the other hand, were divided into migrants and the LGBT community, people related to either of these communities (e.g., people who support them), the journalist who wrote or medium that provided the article, another commenter, other targets and no target.…”
Section: Corpus Description (mentioning)
confidence: 99%
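
The type and target scheme described in the citation statement above can be sketched as a small data structure. This is only an illustrative reading of that description, not the datasets' actual label format: the names `SpeechType`, `Trigger`, `Target`, `Annotation` and `is_consistent` are hypothetical, and the rule that violent or offensive comments always carry a trigger is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SpeechType(Enum):
    VIOLENT = "violent"
    OFFENSIVE = "offensive"
    INAPPROPRIATE = "inappropriate"  # described as having no specific target
    APPROPRIATE = "appropriate"


class Trigger(Enum):
    BACKGROUND = "personal background"         # e.g. religion, gender, nationality
    INDIVIDUAL = "individual characteristics"


class Target(Enum):
    MIGRANTS_OR_LGBT = "migrants or the LGBT community"
    RELATED = "people related to either community"
    JOURNALIST_OR_MEDIUM = "journalist or medium"
    COMMENTER = "another commenter"
    OTHER = "other target"
    NONE = "no target"


@dataclass(frozen=True)
class Annotation:
    speech_type: SpeechType
    target: Target
    trigger: Optional[Trigger] = None


def is_consistent(a: Annotation) -> bool:
    """Check an annotation against the scheme as read from the description."""
    if a.speech_type is SpeechType.INAPPROPRIATE:
        # Inappropriate speech is described as lacking a specific target.
        return a.target is Target.NONE and a.trigger is None
    if a.speech_type is SpeechType.APPROPRIATE:
        return a.trigger is None
    # Assumption: violent and offensive speech always name a trigger.
    return a.trigger is not None
```

For example, `Annotation(SpeechType.OFFENSIVE, Target.MIGRANTS_OR_LGBT, Trigger.BACKGROUND)` passes the check, while an inappropriate comment aimed at another commenter does not.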