Purpose
Vast volumes of rich online consumer-generated content (CGC) can be used effectively to gain important insights for decision-making, product improvement and brand management. Recently, many studies have proposed semi-supervised aspect-based sentiment classification of unstructured CGC. However, most existing CGC mining methods rely on explicitly detecting aspect-based sentiments and overlook the context of sentiment-bearing words. Therefore, this study aims to extract implicit context-sensitive sentiment and to handle the slang and the ambiguous, informal and special words used in CGC.
Design/methodology/approach
A novel text mining framework is proposed to detect and evaluate implicit semantic word relations and context. First, part-of-speech (POS) tagging is used to detect aspect descriptions and sentiment-bearing words. Then, latent Dirichlet allocation (LDA) is used to group similar aspects into attributes. Semantically and contextually similar words are found using the skip-gram model for distributed word vectorisation. Finally, to find the context-sensitive sentiment of each attribute, cosine similarity is used along with a set of positive and negative seed words.
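The final step, scoring a word against seed words by cosine similarity, can be illustrated with a minimal sketch. This is not the paper's implementation: the toy three-dimensional vectors, the seed lists and the function names (`cosine`, `seed_polarity`, `label`) are hypothetical, and the scoring rule (mean similarity to positive seeds minus mean similarity to negative seeds) is an assumption, since the abstract does not state the exact formula. In practice the vectors would come from a skip-gram model trained on the review corpus.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_u * norm_v)


def seed_polarity(word, vectors, pos_seeds, neg_seeds):
    """Assumed scoring rule: mean similarity to the positive seeds
    minus mean similarity to the negative seeds."""
    w = vectors[word]
    pos = sum(cosine(w, vectors[s]) for s in pos_seeds) / len(pos_seeds)
    neg = sum(cosine(w, vectors[s]) for s in neg_seeds) / len(neg_seeds)
    return pos - neg


def label(word, vectors, pos_seeds, neg_seeds):
    """Map the polarity score to a coarse sentiment label."""
    score = seed_polarity(word, vectors, pos_seeds, neg_seeds)
    return "positive" if score > 0 else "negative"


# Hypothetical toy embeddings standing in for skip-gram vectors.
TOY_VECTORS = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "bad": [0.1, 0.9, 0.0],
    "poor": [0.2, 0.8, 0.1],
    "crisp": [0.7, 0.2, 0.3],   # e.g. "crisp display"
    "laggy": [0.2, 0.7, 0.3],   # e.g. "laggy interface"
}
POS_SEEDS = ["good", "great"]
NEG_SEEDS = ["bad", "poor"]

if __name__ == "__main__":
    for w in ("crisp", "laggy"):
        print(w, label(w, TOY_VECTORS, POS_SEEDS, NEG_SEEDS))
```

Because the score depends only on where a word sits in the embedding space, informal or slang terms that never appear in a sentiment lexicon can still receive a polarity, provided they occur in contexts similar to those of the seed words.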
Findings
Experimental results on more than 400,000 Amazon mobile phone reviews showed that the proposed method efficiently found product attributes and their corresponding context-aware sentiments. Using context-sensitive information, the method also achieved higher classification accuracy than the baseline model and state-of-the-art techniques on data sets from two different domains.
Practical implications
Extracted attributes can be easily classified into consumer issues and brand merits. A brand-based comparative study is presented to demonstrate the practical significance of the proposed approach.
Originality/value
This paper presents a novel method for context-sensitive attribute-based sentiment analysis of CGC, which is useful for both brand and product improvement.