This paper presents a set of experiments that evaluate and compare the performance of CBOW Word2Vec and Lemma2Vec models for Arabic Word-in-Context (WiC) disambiguation without using sense inventories or sense embeddings. As part of the SemEval-2021 Shared Task 2 on WiC disambiguation, we used the dev.ar-ar dataset (2k sentence pairs) to decide whether two words in a given sentence pair carry the same meaning. We used two Word2Vec models: Wiki-CBOW, a model pre-trained on Arabic Wikipedia, and another model that we trained on large Arabic corpora of about 3 billion tokens. Two Lemma2Vec models were also constructed based on the two Word2Vec models. Each of the four models was then used in the WiC disambiguation task and evaluated on the SemEval-2021 test.ar-ar dataset. Finally, we report the performance of the different models and compare the lemma-based models with the word-based ones.
This paper presents ongoing work on the extraction of Arabic reported speech by Lebanese politicians from Lebanese Arabic-language newspapers. This work is part of a functional system for the extraction, presentation and archiving of reported speech by Lebanese politicians, which constitutes a valuable resource for political analysts, press agents, company researchers and political actors. The system automatically identifies about 280 reported speeches per day, together with their referents, from about 1,000 newspaper articles. All of them are correctly identified as reported speech, but only about 200 are correctly attributed to their referents. The reported speeches that are both correctly identified and correctly attributed are then submitted to a web-based application, which is publicly accessible at http://citations-explorer.com/lpc/.