“…After the completion of this process, 50% of the annotated passages were cross-annotated by a second fact-checker to measure inter-annotator agreement. We report Cohen's kappa (κ; McHugh, 2012), Krippendorff's alpha (α; Krippendorff, 2011), the intraclass correlation coefficient (two-way mixed, average-score ICC(3, k) with k = 2; Cicchetti, 1994), and accuracy, i.e., the percentage of passages on which both annotators agreed (Maronikolakis et al., 2022). Across the three labels of derogatory, exclusionary and dangerous speech, we obtained κ = 0.23, α = 0.24 and ICC(3, k) = 0.41, which is considered "fair" (Cicchetti, 1994; Maronikolakis et al., 2022).…”
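As a minimal sketch of how two of these metrics relate, the following computes accuracy (raw agreement) and Cohen's kappa from two annotators' label sequences. The annotation lists here are hypothetical illustrations, not the paper's data; kappa discounts the agreement expected by chance from each annotator's label marginals, which is why it can be far lower than raw accuracy.

```python
from collections import Counter

def cohen_kappa(a, b):
    """Cohen's kappa for two annotators' label sequences of equal length."""
    assert len(a) == len(b)
    n = len(a)
    # Observed agreement: fraction of items both annotators labeled identically.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: expected overlap given each annotator's label frequencies.
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[lab] * cb[lab] for lab in set(a) | set(b)) / n ** 2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical annotations for illustration only (not the study's data).
ann1 = ["derogatory", "none", "exclusionary", "none", "dangerous", "none"]
ann2 = ["derogatory", "none", "none", "none", "dangerous", "exclusionary"]

accuracy = sum(x == y for x, y in zip(ann1, ann2)) / len(ann1)
kappa = cohen_kappa(ann1, ann2)
print(f"accuracy = {accuracy:.3f}, kappa = {kappa:.3f}")
# → accuracy = 0.667, kappa = 0.500
```

Krippendorff's alpha and ICC(3, k) follow the same chance-corrected logic but handle missing data and continuous ratings respectively; in practice they are usually computed with dedicated packages rather than by hand.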