This paper describes in detail the approach taken by the GTI research group for SemEval 2016 Task 5: Aspect-Based Sentiment Analysis, across the proposed subtasks, languages, and dataset contexts. In particular, we developed an SVM-based system for category detection and a CRF-based system for opinion target detection. Both are built for the restaurant domain in English and Spanish. Finally, for aspect-based sentiment analysis we followed an unsupervised approach based on lexicons and syntactic dependencies, in English for the laptop and restaurant domains.
Short texts are omnipresent in real-time news, social network commentaries, etc. Traditional text representation methods have been successfully applied to self-contained documents of medium size. However, the information in short texts is often insufficient, due, for example, to the use of mnemonics, which makes them hard to classify. Therefore, the particularities of specific domains must be exploited. In this article we describe a novel system that combines Natural Language Processing techniques with Machine Learning algorithms to classify banking transaction descriptions for personal finance management, a problem that was not previously considered in the literature. We trained and tested the system on a labelled dataset of real customer transactions that will be available to other researchers on request. Motivated by existing solutions in spam detection, we also propose a short text similarity detector, based on the Jaccard distance, to reduce the training set size. Experimental results with a two-stage classifier combining this detector with an SVM indicate high accuracy compared with alternative approaches, taking complexity and computing time into account. Finally, we present a use case with a personal finance application, CoinScrap, which is available on Google Play and the App Store.
INDEX TERMS: Machine learning, natural language processing, banking, personal finance management.
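As an illustration of the Jaccard-based similarity detector mentioned in the abstract above, the following is a minimal sketch rather than the authors' implementation. It assumes word-level tokens, a greedy single pass over the data, and that a sample is dropped only when it is close to an already kept sample with the same label. The function names (`tokens`, `jaccard_distance`, `reduce_training_set`) and the threshold value are hypothetical.

```python
import re


def tokens(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_distance(a, b):
    """Jaccard distance between two sets: 1 - |a & b| / |a | b|."""
    if not a and not b:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(a | b)


def reduce_training_set(samples, threshold=0.4):
    """Greedily drop near-duplicate (text, label) pairs.

    A sample is discarded when an already kept sample with the same label
    lies within `threshold` Jaccard distance (an assumed criterion).
    """
    kept = []  # list of (token_set, text, label)
    for text, label in samples:
        toks = tokens(text)
        is_duplicate = any(
            kept_label == label and jaccard_distance(toks, kept_toks) <= threshold
            for kept_toks, _, kept_label in kept
        )
        if not is_duplicate:
            kept.append((toks, text, label))
    return [(text, label) for _, text, label in kept]
```

In a two-stage setup like the one the abstract describes, the reduced set returned by `reduce_training_set` would then be used to train the SVM. The threshold trades training set size against coverage and would need tuning on real data.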