Proceedings of the 13th International Workshop on Semantic Evaluation 2019
DOI: 10.18653/v1/s19-2077

LT3 at SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter (hatEval)

Abstract: This paper describes our contribution to the SemEval-2019 Task 5 on the detection of hate speech against immigrants and women in Twitter (hatEval). We considered a supervised classification-based approach to detect hate speech in English tweets, which combines a variety of standard lexical and syntactic features with specific features for capturing offensive language. Our experimental results show good classification performance on the training data, but a considerable drop in recall on the held-out test set.
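The abstract describes the system only at a high level: standard lexical and syntactic features combined with offense-specific features, fed to a supervised classifier (a linear SVM, per the modeling description quoted further below). The following is a minimal sketch of such a pipeline in scikit-learn; the lexicon contents, feature choices, and hyperparameters are illustrative assumptions, not the paper's actual configuration.

```python
# Illustrative sketch only: approximates the kind of feature-combining SVM
# pipeline the abstract describes. The lexicon entries and exact feature
# settings are hypothetical, not taken from the paper.
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import LinearSVC

OFFENSE_LEXICON = {"idiot", "scum"}  # placeholder entries, not the paper's lexicon

class LexiconCounter(BaseEstimator, TransformerMixin):
    """Counts offensive-lexicon hits per tweet (one numeric feature)."""
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        return np.array([[sum(tok in OFFENSE_LEXICON for tok in text.lower().split())]
                         for text in X])

pipeline = Pipeline([
    ("features", FeatureUnion([
        ("word_ngrams", TfidfVectorizer(analyzer="word", ngram_range=(1, 3))),
        ("char_ngrams", TfidfVectorizer(analyzer="char", ngram_range=(2, 5))),
        ("lexicon", LexiconCounter()),
    ])),
    ("clf", LinearSVC(C=1.0)),
])

# pipeline.fit(train_tweets, train_labels)
# predictions = pipeline.predict(test_tweets)
```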

Cited by 8 publications (6 citation statements). References 9 publications.
“…Such a finding is consistent with previous studies that showed that sentiments are effective in showing a large number of hate speech contents [37], [38], [54]. In addition, the findings are consistent with works related to LIWC as additional features showing human behavior [46].…”
Section: E. Application of the Proposed Methods (Model Learning) (supporting)
confidence: 93%
“…Conversely, our ensemble model has obtained 0.62 of F1 score for task B which exceeds the best systems of HatEval Task B, LT3 (Bauwelinck et al., 2019) (F1 score = 0.47) and MFC baseline (F1 score = 0.42) (Basile et al., 2019). In Task B of HatEval, no system has been able to outperform the EMR score of MFC baseline, which achieved 0.58 of EMR (Note: Exact Matching Ratio was the metric used for HatEval Task B evaluation).…”
Section: Answering RQ1 - Model Evaluation (mentioning)
confidence: 63%
“…We also achieved 0.62 EMR for Task B on the test set. Due to the difficulty in replicating the LT3 system (Bauwelinck et al., 2019) to train on the 'adjusted' dataset, we obtained the performance of the 'SVM+USE' model (Indurthi et al., 2019) using our 'adjusted' dataset. As shown in Figure 4, our model and baseline demonstrated equal performance in Task A. Conversely, our model outperforms the 'SVM+USE' baseline by a margin of 0.06 in Task B.…”
Section: Answering RQ1 - Model Evaluation (mentioning)
confidence: 99%
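For reference, the Exact Matching Ratio mentioned above counts a prediction as correct only if every label for a tweet is right. A minimal sketch, assuming binary 0/1 label vectors for Task B's HS/TR/AG labels:

```python
import numpy as np

def exact_match_ratio(y_true, y_pred):
    """Fraction of examples whose full label vectors match exactly
    (the hatEval Task B evaluation metric)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean(np.all(y_true == y_pred, axis=1))

# Example: 2 of 3 tweets have all three labels predicted correctly.
y_true = [[1, 0, 1], [0, 0, 0], [1, 1, 1]]
y_pred = [[1, 0, 1], [0, 1, 0], [1, 1, 1]]
print(exact_match_ratio(y_true, y_pred))  # 0.666...
```

For binary indicator matrices this is equivalent to sklearn.metrics.accuracy_score applied to multilabel data.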
“…For sub-task B, the multi-classification task, we replicated the winning approach (Bauwelinck et al., 2019) by training three separate classifiers to classify three label pairs individually; these classifiers used a linear SVM on handcrafted syntactic, lexical and bag-of-words features. The optimal hyperparameters were found using grid search.…”
Section: Modeling (mentioning)
confidence: 99%
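A minimal sketch of the replication strategy this quote describes: one grid-searched linear SVM per label over bag-of-words features. The parameter grid, feature settings, and label names are assumptions for illustration, not values reported by Bauwelinck et al. (2019).

```python
# Sketch of the one-classifier-per-label setup, assuming binary 0/1 columns
# for the HS, TR and AG labels; the C grid and n-gram ranges are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

def train_label_classifier(texts, labels):
    """Grid-searches a linear SVM over bag-of-words features for one label."""
    pipe = Pipeline([
        ("bow", TfidfVectorizer(analyzer="word", ngram_range=(1, 2))),
        ("svm", LinearSVC()),
    ])
    grid = GridSearchCV(pipe, {"svm__C": [0.01, 0.1, 1, 10]}, cv=5, scoring="f1")
    grid.fit(texts, labels)
    return grid.best_estimator_

# One independent classifier per label, as in the replicated system:
# classifiers = {name: train_label_classifier(train_texts, train_labels[name])
#                for name in ["HS", "TR", "AG"]}
```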