This paper is focused on aspect term extraction in aspect-based sentiment analysis (ABSA), which is one of the hot spots in natural language processing (NLP). This paper proposes MFE-CRF that introduces Multi-Feature Embedding (MFE) clustering based on the Conditional Random Field (CRF) model to improve the effect of aspect term extraction in ABSA. First, Multi-Feature Embedding (MFE) is proposed to improve the text representation and capture more semantic information from text. Then the authors use kmeans++ algorithm to obtain MFE and word clustering to enrich the position features of CRF. Finally, the clustering classes of MFE and word embedding are set as the additional position features to train the model of CRF for aspect term extraction. The experiments on SemEval datasets validate the effectiveness of this model. The results of different models indicate that MFE-CRF can greatly improve the Recall rate of CRF model. Additionally, the Precision rate also is increased obviously when the semantics of text is complex.
Extractive summarization aims to produce a concise version of a document by extracting information-rich sentences from the original texts. The graph-based model is an effective and efficient approach to rank sentences since it is simple and easy to use. However, its performance depends heavily on good text representation. In this paper, an integrated graph model (iGraph) for extractive text summarization is proposed. An enhanced embedding model is used to detect the inherent semantic properties at the word level, bigram level and trigram level. Words with part-of-speech (POS) tags, bigrams and trigrams were extracted to train the embedding models. Based on the enhanced embedding vectors, the similarity values between the sentences were calculated in three perspectives. The sentences in the document were treated as vertexes and the similarity between them as edges. As a result, three different types of semantic graphs were obtained for every document, with the same nodes and different edges. These three graphs were integrated into one enriched semantic graph in a naive Bayesian fashion. After that, TextRank, which is a graph-based ranking algorithm, was applied to rank the sentences, before the top scored sentences were selected for the summary according to the compression rate. Evaluated on the DUC 2002 and DUC 2004 datasets, our proposed method shows competitive performance compared to the state-of-the-art methods.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.