Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis 2018
DOI: 10.18653/v1/w18-6204

Language Independent Sentiment Analysis with Sentiment-Specific Word Embeddings

Abstract: Data annotation is a critical step in training a text model, but it is tedious, expensive and time-consuming. We present a language-independent method to train a sentiment polarity model with a limited amount of manually labeled data. Word embeddings such as Word2Vec are efficient at incorporating semantic and syntactic properties of words, yielding good results for document classification. However, these embeddings might map words with opposite polarities to vectors close to each other. We train Sentiment Specific …
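To make the abstract's idea concrete, the following is a minimal, hypothetical sketch (not the authors' exact architecture): pre-trained Word2Vec vectors initialise a trainable embedding layer, and a small amount of labeled data supplies a polarity loss that reshapes those vectors. The layer sizes, the averaging classifier and the toy data are assumptions for illustration only.

```python
# Hypothetical sketch: fine-tune pre-trained word vectors with a sentiment
# objective so that polarity, not just co-occurrence, shapes the space.
import numpy as np
import torch
import torch.nn as nn

class SentimentSpecificEmbedder(nn.Module):
    def __init__(self, pretrained_vectors, num_classes=2):
        super().__init__()
        dim = pretrained_vectors.shape[1]
        # Initialise from Word2Vec and keep the weights trainable so the
        # supervised sentiment signal can move them.
        self.embedding = nn.Embedding.from_pretrained(
            torch.tensor(pretrained_vectors, dtype=torch.float), freeze=False)
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, token_ids):
        # Average the word vectors of each document and predict its polarity.
        return self.classifier(self.embedding(token_ids).mean(dim=1))

# Toy usage: random stand-ins for real Word2Vec vectors and labeled documents.
vectors = np.random.randn(1000, 100).astype("float32")
model = SentimentSpecificEmbedder(vectors)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

token_ids = torch.randint(0, 1000, (8, 20))   # 8 documents, 20 tokens each
labels = torch.randint(0, 2, (8,))            # 0 = negative, 1 = positive
loss = loss_fn(model(token_ids), labels)
loss.backward()
optimizer.step()
```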

Cited by 11 publications (5 citation statements) · References 15 publications
“…It is therefore hardly surprising that most efforts have focused on the application of machine learning algorithms, which produce classifiers whose statistical model is learnt, typically, in a supervised fashion from a corpus of labeled examples, such as user reviews (Pang et al. 2002; Turney 2002) or tweets (Nakov 2016). The techniques to improve the performance of classifiers have progressively gained computational complexity over the years; at present, supervised methods such as Support Vector Machines (SVM), Bayesian Networks and Neural Networks, among others, are commonplace, as are unsupervised learning techniques based on pre-trained word embeddings (such as Word2vec) or clustering techniques (e.g., Tang et al. 2014; Kalchbrenner et al. 2014; Kim 2014; Fu et al. 2018; Saroufim et al. 2018), with or without the aid of external lexical resources (sentiment dictionaries).…”
Section: Related Work
confidence: 99%
“…In [13], the authors suggested rating tweets in French using a dataset of positive and negative emojis, training Sentiment-Specific Word Embeddings (SSWE) on top of an unsupervised Word2Vec model. The embeddings were updated through deep learning with a bidirectional LSTM on the auto-labeled data.…”
Section: Related Work
confidence: 99%
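A hedged reading of the approach described in the statement above: tweets auto-labeled by their emojis train a bidirectional LSTM whose embedding layer is initialised from Word2Vec and kept trainable, so the embeddings become sentiment-specific. The layer sizes, emoji lists and labeling rule below are assumptions, not values from [13].

```python
# Sketch under assumptions: BiLSTM sentiment classifier fine-tuning
# Word2Vec-initialised embeddings on emoji-auto-labeled tweets.
import torch
import torch.nn as nn

class BiLSTMSentiment(nn.Module):
    def __init__(self, pretrained_vectors, hidden_size=128, num_classes=2):
        super().__init__()
        # Embeddings start from Word2Vec (a FloatTensor) but stay trainable.
        self.embedding = nn.Embedding.from_pretrained(pretrained_vectors,
                                                      freeze=False)
        self.bilstm = nn.LSTM(pretrained_vectors.size(1), hidden_size,
                              batch_first=True, bidirectional=True)
        self.output = nn.Linear(2 * hidden_size, num_classes)

    def forward(self, token_ids):
        states, _ = self.bilstm(self.embedding(token_ids))
        # Classify from the last time step of the concatenated directions.
        return self.output(states[:, -1, :])

def emoji_label(tweet: str):
    """Toy auto-labeling rule: positive emojis -> 1, negative -> 0, else None."""
    if any(e in tweet for e in ("😊", "😍", "👍")):
        return 1
    if any(e in tweet for e in ("😠", "😢", "👎")):
        return 0
    return None
```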
“…Another issue with pre-trained embeddings is that they do not account for the sentiment polarity of words and might map words with opposite polarities to vectors close to each other in Euclidean space. A novel sentiment-specific word embedding is proposed for language-independent sentiment analysis, which shows an improvement over traditional pre-trained word2vec embeddings [12].…”
Section: Related Work
confidence: 99%
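A small illustration of the Euclidean-proximity issue raised above, using gensim's downloadable Google News Word2Vec model; the exact similarity values will vary, but antonyms such as "good"/"bad" share contexts and therefore tend to score high.

```python
# Illustration only: general-purpose Word2Vec does not separate polarity.
import gensim.downloader as api

vectors = api.load("word2vec-google-news-300")   # pre-trained Word2Vec vectors
print(vectors.similarity("good", "bad"))         # high (roughly 0.7): antonyms sit close
print(vectors.similarity("good", "great"))       # also high: polarity is not encoded
```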