With growing internet access and the ease of writing comments in the Nepali language, fine-grained sentiment analysis of social media comments is becoming increasingly pertinent. Benchmark datasets exist for high-resource languages (English, French, and German) in specific domains such as restaurants, hotels, and electronic goods, but not for low-resource languages like Nepali. In this paper, we present our work to create a dataset for targeted aspect-based sentiment analysis in the social media domain, set up a benchmark on it, and evaluate various machine learning models. The dataset comprises code-mixed and code-switched comments extracted from Nepali YouTube videos. We present convincing baselines using a multilingual BERT model for the Aspect Term Extraction task and a BiLSTM model for the Sentiment Classification task, achieving F1 scores of 57.978% and 81.60%, respectively.
Named Entity Recognition (NER) has been studied for many languages, including English, German, and Spanish, but no study has focused on the Nepali language. In this paper, we propose a neural Nepali NER model using a recent state-of-the-art grapheme-level architecture that requires neither hand-crafted features nor data pre-processing. Our novel neural model achieves a relative improvement of 33% to 50% over a feature-based SVM model and up to 10% over state-of-the-art neural models developed for languages other than Nepali.