“…As confirmation of this statement, 4 out of 12 tasks in the SemEval 2023 competition were based on Twitter data. Monolingual models have been trained on tweets in languages such as English (Nguyen et al, 2020), Arabic (Antoun et al, 2020; Abdelali et al, 2021), French (Guo et al, 2021), Hebrew (Seker et al, 2022), Indonesian (Koto et al, 2021), Italian (Polignano et al, 2019) and Spanish (González et al, 2021; Pérez et al, 2022). Some of them were initialized with weights of existing general-domain models and adapted to Twitter data by continued pre-training, while others were trained on Twitter data from scratch.…”