The work of A. Dridi was supported by the Faculty of Computing, Engineering and Built Environment, Birmingham City University, through a Full Bursary Ph.D. Scholarship.
In this paper, a fine-grained supervised approach is proposed to identify bullish and bearish sentiments associated with companies and stocks by predicting a real-valued score between −1 and +1. The model is trained on several feature sets: lexical features, semantic features, and a combination of the two. Our study reveals that semantic features, most notably BabelNet synsets and semantic frames, can be successfully applied to Sentiment Analysis within the financial domain to achieve better results. Moreover, a comparative study has been conducted between our supervised approach and unsupervised approaches. The experimental results show that our approach outperforms the unsupervised ones.
Word embeddings are increasingly attracting the attention of researchers dealing with semantic similarity and analogy tasks. However, finding the optimal hyper-parameters remains an important challenge because of their impact on the revealed analogies, particularly for domain-specific corpora. Because analogies are widely used for hypothesis synthesis, it is crucial to optimise word embedding hyper-parameters for precise hypothesis synthesis. Therefore, we propose, in this paper, a methodological approach for tuning word embedding hyper-parameters by using the stability of the k-nearest neighbors of word vectors within scientific corpora, and more specifically Computer Science corpora, with Machine Learning adopted as a case study. This approach is tested on a dataset created from NIPS publications, and evaluated with a curated ACM hierarchy and the Wikipedia Machine Learning outline as the gold standard. Our quantitative and qualitative analyses indicate that our approach not only reliably captures interesting patterns like "unsupervised learning is to kmeans as supervised learning is to knn", but also captures the analogical hierarchy structure of Machine Learning and consistently outperforms state-of-the-art embeddings on syntactic accuracy, reaching 68% against 61%.
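The abstract does not spell out how neighbor stability is measured, so the following is only a minimal sketch of one plausible reading: train embeddings twice under the same hyper-parameter setting, then average the Jaccard overlap between each word's k-nearest-neighbor sets across the two runs. The Jaccard measure, the function names (`cosine`, `knn`, `knn_stability`) and the toy vectors are all assumptions made for illustration, not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn(word, vectors, k):
    """Return the set of the k words most similar to `word` (excluding itself)."""
    sims = [
        (cosine(vectors[word], vec), other)
        for other, vec in vectors.items()
        if other != word
    ]
    # Sort by descending similarity; break ties alphabetically for determinism.
    sims.sort(key=lambda pair: (-pair[0], pair[1]))
    return {other for _, other in sims[:k]}


def knn_stability(run_a, run_b, k):
    """Mean Jaccard overlap of k-NN sets over the vocabulary shared by two runs.

    Assumption: Jaccard overlap stands in for the paper's stability measure.
    """
    shared = sorted(set(run_a) & set(run_b))
    vecs_a = {w: run_a[w] for w in shared}
    vecs_b = {w: run_b[w] for w in shared}
    scores = []
    for w in shared:
        a, b = knn(w, vecs_a, k), knn(w, vecs_b, k)
        union = a | b
        scores.append(len(a & b) / len(union) if union else 1.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy runs: run_b is run_a rotated by 90 degrees, so every
# neighborhood is preserved; run_c swaps the neighborhoods entirely.
run_a = {"kmeans": [1.0, 0.0], "clustering": [0.9, 0.1],
         "knn": [0.0, 1.0], "classification": [0.1, 0.9]}
run_b = {"kmeans": [0.0, 1.0], "clustering": [-0.1, 0.9],
         "knn": [-1.0, 0.0], "classification": [-0.9, 0.1]}
run_c = {"kmeans": [1.0, 0.0], "clustering": [0.0, 1.0],
         "knn": [0.9, 0.1], "classification": [0.1, 0.9]}

print(knn_stability(run_a, run_b, k=1))
print(knn_stability(run_a, run_c, k=1))
```

Under this reading, a hyper-parameter setting would be preferred when repeated training runs yield a score close to 1, since the neighborhoods (and hence the analogies built on them) do not change from run to run.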