“…Some other studies adopted several neural-based models, including convolutional neural networks (CNN) [75,141], long short-term memory (LSTM) [8,75,92,94,145], bidirectional LSTM (Bi-LSTM) [115], and gated recurrent units (GRU) [27]. The most recent works focus on investigating the transferability or generalizability of state-of-the-art transformer-based models, such as Bidirectional Encoder Representations from Transformers (BERT) [19,48,66,79,83,90,92,134] and its variants like RoBERTa [48], in the cross-domain abusive language detection task.…”