“…Nevertheless, comparisons of ELMo with other methods remain limited and inconclusive because it is a novel technology. On the other hand, compared with word-level deep networks, character-level text processing may place less emphasis on capturing high-level associations between words, but it is significantly more compact and uses less memory (Wullach, Adler & Minkov, 2021; Zhang, Robinson & Tepper, 2018). Several character-level models exist, such as CANINE (Clark et al., 2021), CharBERT (Ma et al., 2020), CharacterBERT (El Boukkouri et al., 2021), and Charformer (Tay et al., 2022), but they are rarely used for abusive content detection.…”
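
The compactness argument in the excerpt can be illustrated with a minimal sketch. It is not taken from any of the cited works: the toy texts, the helper names (`unknown_words`, `unknown_chars`, `embedding_parameters`), and the choice of a lowercase-plus-digits alphabet are all assumptions made for illustration. The sketch shows that a character vocabulary is bounded by the alphabet, whereas a word vocabulary grows with the corpus and misses obfuscated spellings that are common in abusive text.

```python
import string

# Hypothetical toy training texts; not data from any cited study.
train_texts = ["you are horrible", "have a nice day"]

# Word-level vocabulary: gains an entry for every new surface form in the corpus.
word_vocab = {w for text in train_texts for w in text.split()}

# Character-level vocabulary (assumed alphabet): fixed size, independent of the corpus.
char_vocab = set(string.ascii_lowercase + string.digits + " ")


def unknown_words(text):
    """Return the words in `text` that the word-level vocabulary cannot represent."""
    return [w for w in text.split() if w not in word_vocab]


def unknown_chars(text):
    """Return the characters in `text` that the character-level vocabulary cannot represent."""
    return [c for c in text if c not in char_vocab]


def embedding_parameters(vocab_size, dim):
    """Number of weights in an embedding table with one `dim`-sized vector per entry."""
    return vocab_size * dim


# An obfuscated spelling is out-of-vocabulary for words but fully covered by characters.
obfuscated = "you are h0rrible"
print(unknown_words(obfuscated))
print(unknown_chars(obfuscated))
print(len(char_vocab), embedding_parameters(len(char_vocab), 128))
```

Under these assumptions the character embedding table stays at 37 entries regardless of how much text is added, while the word-level table would need a new row for each previously unseen spelling. The sketch says nothing about downstream accuracy, which depends on the model built on top of the embeddings.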