Aspect-Based Sentiment Analysis Using a Two-Step Neural Network Architecture

Jebbara, Soufian; Cimiano, Philipp

doi:10.1007/978-3-319-46565-4_12

Cited by 45 publications

(32 citation statements)

References 27 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…To build the distantly supervised models, we use untagged reviews which are about the 8 products of USAGE. The baseline systems are (Klinger and Cimiano, 2014) and (Jebbara and Cimiano, 2016), which are state-of-the-art systems on the dataset.…”

Section: Results On Usage Corpusmentioning

confidence: 99%

Large-scale Opinion Relation Extraction with Distantly Supervised Neural Network

Sun

Lan

et al. 2017

Proceedings of the 15th Conference of the European Chapter of The Association for Computational Linguistics: Volume 1

View full text Add to dashboard Cite

We investigate the task of open domain opinion relation extraction. Given a large number of unlabelled texts, we propose an efficient distantly supervised framework based on pattern matching and neural network classifiers. The patterns are designed to automatically generate training data, and the deep learning model is designed to capture various lexical and syntactic features. The result algorithm is fast and scalable on large-scale corpus. We test the system on the Amazon online review dataset, and show that the proposed model is able to achieve promising performances without any human annotations.

show abstract

Section: Results On Usage Corpusmentioning

confidence: 99%

Large-scale Opinion Relation Extraction with Distantly Supervised Neural Network

Sun

Lan

et al. 2017

Proceedings of the 15th Conference of the European Chapter of The Association for Computational Linguistics: Volume 1

View full text Add to dashboard Cite

show abstract

“…Previous work in the direction of aspect-based sentiment analysis shows a positive impact of POS tag features for the extraction of opinion phrases and opinion target expressions (Toh and Wang, 2014;Jebbara and Cimiano, 2016). It stands to reason if the character-level word embeddings act in a similar way.…”

Section: Discussionmentioning

confidence: 99%

“…The word embedding matrix W wrd is initialized with a pretrained matrix of skip-gram embeddings trained on a corpus of amazon reviews , 2015). Earlier work showed that using a domain specific corpus in the pretraining stage significantly improves performance for similar tasks (Jebbara and Cimiano, 2016).…”

Section: Network Trainingmentioning

confidence: 99%

Improving Opinion-Target Extraction with Character-Level Word Embeddings

Jebbara

Cimiano

2017

Proceedings of the First Workshop on Subword and Character Level Models in NLP

Self Cite

View full text Add to dashboard Cite

Fine-grained sentiment analysis received increasing attention in recent years. Extracting opinion target expressions (OTE) in reviews is often an important step in fine-grained, aspect-based sentiment analysis. Retrieving this information from user-generated text, however, can be difficult. Customer reviews, for instance, are prone to contain misspelled words and are difficult to process due to their domainspecific language. In this work, we investigate whether character-level models can improve the performance for the identification of opinion target expressions. We integrate information about the character structure of a word into a sequence labeling system using character-level word embeddings and show their positive impact on the systems performance. Specifically, we obtain an increase by 3.3 points F 1 -score with respect to our baseline model. In further experiments, we reveal encoded character patterns of the learned embeddings and give a nuanced view of the performance differences of both models.

show abstract

“…Performing the same procedure with the regular skip-gram word embeddings results in no clear separation between the 6 suffix groups (see Figure 3b). Previous work in the direction of aspect-based sentiment analysis shows a positive impact of POS tag features for the extraction of opinion phrases and opinion target expressions (Toh and Wang, 2014;Jebbara and Cimiano, 2016). It stands to reason if the character-level word embeddings act in a similar way.…”

mentioning

confidence: 99%

“…The word embedding matrix W wrd is initialized with a pretrained matrix of skip-gram embeddings trained on a corpus of amazon reviews ( McAuley et al, 2015). Earlier work showed that using a domain specific corpus in the pretraining stage significantly improves performance for similar tasks (Jebbara and Cimiano, 2016). …”

mentioning

confidence: 99%

Proceedings of the First Workshop on Subword and Character Level Models in NLP

2017

View full text Add to dashboard Cite

IntroductionTraditional NLP starts with a hand-engineered layer of representation, the level of tokens or words. A tokenization component first breaks up the text into units using manually designed rules. Tokens are then processed by components such as word segmentation, morphological analysis and multiword recognition. The heterogeneity of these components makes it hard to create integrated models of both structure within tokens (e.g., morphology) and structure across multiple tokens (e.g., multi-word expressions). This approach can perform poorly (i) for morphologically rich languages, (ii) for noisy text, (iii) for languages in which the recognition of words is difficult and (iv) for adaptation to new domains; and (v) it can impede the optimization of preprocessing in end-to-end learning.The workshop provides a forum for discussing recent advances as well as future directions on sub-word and character-level natural language processing and representation learning that address these problems.We received 37 submissions, out of which we accepted 24 as papers and 4 as extended abstracts. AbstractMost of neural language models use different kinds of embeddings for word prediction. While word embeddings can be associated to each word in the vocabulary or derived from characters as well as factored morphological decomposition, these word representations are mainly used to parametrize the input, i.e. the context of prediction. This work investigates the effect of using subword units (character and factored morphological decomposition) to build output representations for neural language modeling. We present a case study on Czech, a morphologically-rich language, experimenting with different input and output representations. When working with the full training vocabulary, despite unstable training, our experiments show that augmenting the output word representations with character-based embeddings can significantly improve the performance of the model. Moreover, reducing the size of the output look-up table, to let the character-based embeddings represent rare words, brings further improvement. IntroductionMost of neural language models, such as n-gram models (Bengio et al., 2003) are word based and rely on the definition of a finite vocabulary V. Therefore, a look-up table maps each wordw ∈ V to a vector of real features, and is stored in a matrix. While this approach yields significant improvement for a variety of tasks and languages, see for instance (Schwenk, 2007) in speech recognition and (Le et al., 2012; Devlin et al., 2014; in machine translation, it induces several limitations.For morphologically-rich languages, like Czech or German, the lexical coverage is still an important issue, since there is a combinatorial explosion of word forms, most of which are hardly observed on training data. On the one hand, growing the look-up table is not a solution, since it would increase the number of parameters without having enough training examples for a proper estimation. On the other hand, rare words can be replaced...

show abstract

Aspect-Based Sentiment Analysis Using a Two-Step Neural Network Architecture

Cited by 45 publications

References 27 publications

Large-scale Opinion Relation Extraction with Distantly Supervised Neural Network

Large-scale Opinion Relation Extraction with Distantly Supervised Neural Network

Improving Opinion-Target Extraction with Character-Level Word Embeddings

Proceedings of the First Workshop on Subword and Character Level Models in NLP

Contact Info

Product

Resources

About