In the last few years, large amounts of text data have emerged on social media platforms, and with them a growing interest in extracting valuable information from these data. Named Entity Recognition (NER), a subtask of Natural Language Processing (NLP), remains essential for extracting and classifying entity names in text, whether that text is formal or informal. Nevertheless, the most recent NER solutions have difficulty adapting to the informal text used on social media platforms. This work provides a literature review of papers published in the field of NER on social media from 2014 to the present, identifying the particular characteristics of the Arabic dialect compared with the English language.
The heavy involvement of Arabic-speaking internet users has spread large volumes of data written in Arabic and opened a vast research area in natural language processing (NLP). Sentiment analysis is a growing field of research of great importance, given its high potential for supporting decision-making and predicting upcoming actions from texts produced on social networks. Arabic used on microblogging websites, especially Twitter, is highly informal. It complies with neither standards nor spelling rules, which makes it quite challenging for automatic machine-learning techniques. In this paper, we propose a new approach based on AutoML methods to improve the efficiency of the sentiment classification process for dialectal Arabic. This approach was validated through benchmark testing on three different datasets that represent three vernacular forms of Arabic. The results show that the presented framework achieves significantly higher accuracy than similar works in the literature.