In recent years, joint entity and relation extraction has received increasing scholarly attention. The most difficult aspect of joint extraction, however, is extracting overlapping triples. To address this problem, we propose a joint extraction model based on Soft Pruning and GlobalPointer, called SGNet. First, the BERT pretrained model is used to obtain word-vector representations that carry contextual information, after which graph operations capture both local and non-local information about the word vectors. Specifically, to address the information loss caused by rule-based pruning strategies, we use a Gaussian Graph Generator and an attention-guiding layer to construct a fully connected graph; we call this process soft pruning. Then, to pass messages between nodes and integrate their information, we employ GCNs and a dense connection layer. Next, we use the GlobalPointer decoder to convert triple extraction into quintuple extraction, tackling the difficulty of extracting overlapping triples. Unlike a typical feedforward neural network (FNN), the GlobalPointer decoder can perform joint decoding. Finally, to evaluate the model's performance, we ran experiments on two public datasets, NYT and WebNLG. The experiments show that SGNet performs substantially better on overlapping-triple extraction and achieves good results on both datasets.
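The core graph operation the abstract describes, passing messages over a fully connected "soft-pruned" word graph with a GCN, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the edge weights here stand in for scores a Gaussian Graph Generator or attention layer would produce, and all names and toy data are assumptions.

```python
def gcn_layer(adj, feats):
    """One GCN propagation step: each node aggregates its neighbours'
    features, weighted by the soft edge scores in `adj` and normalized
    by the node's total edge weight."""
    n = len(adj)
    dim = len(feats[0])
    out = []
    for i in range(n):
        total = sum(adj[i]) or 1.0  # row normalization
        out.append([
            sum(adj[i][j] * feats[j][d] for j in range(n)) / total
            for d in range(dim)
        ])
    return out

# Fully connected soft graph over 3 "words"; in SGNet these weights
# would come from the Gaussian Graph Generator / attention-guiding layer.
adj = [
    [1.0, 0.5, 0.1],
    [0.5, 1.0, 0.8],
    [0.1, 0.8, 1.0],
]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_feats = gcn_layer(adj, feats)
```

Because every pair of words keeps a (soft) edge, no dependency arc is hard-pruned away; low-relevance edges are simply down-weighted.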
To address the sparsity of short-text features, the lack of context, and the inability of word embeddings and external knowledge bases to supplement short-text information, this paper proposes a text-, word-, and POS-tag-based graph convolutional network (TWPGCN) for short text classification. The model builds a T-W graph of texts and words, a W-W graph of words and words, and a W-P graph of words and POS tags, then uses a Graph Convolutional Network (GCN) to learn features from each graph and fuses them. TWPGCN focuses only on the structural information of the text graph and does not require pre-trained word embeddings as initial node features, which improves classification accuracy, increases computational efficiency, and reduces computational difficulty. Experimental results show that TWPGCN outperforms state-of-the-art models on five publicly available benchmark datasets. The model is well suited to short or ultra-short text, and its graph-construction method can be extended to other fields.
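The three graphs the abstract names (T-W, W-W, W-P) can be sketched as edge sets built from tokenized documents. The weighting schemes below (word presence and raw co-occurrence counts) are simplified placeholders; the paper's exact edge weights and all names here are assumptions.

```python
from itertools import combinations

# Toy corpus: (doc_id, tokens) plus a POS tag per word.
texts = [("d0", ["good", "phone"]), ("d1", ["good", "battery"])]
pos_tags = {"good": "ADJ", "phone": "NOUN", "battery": "NOUN"}

# T-W graph: connect each text to the words it contains.
tw_edges = {(doc, w) for doc, words in texts for w in set(words)}

# W-W graph: connect word pairs that co-occur in a document,
# weighted here by a simple co-occurrence count.
ww_edges = {}
for _, words in texts:
    for a, b in combinations(sorted(set(words)), 2):
        ww_edges[(a, b)] = ww_edges.get((a, b), 0) + 1

# W-P graph: connect each word to its POS tag.
wp_edges = {(w, tag) for w, tag in pos_tags.items()}
```

Because the graphs are built purely from corpus structure (occurrence, co-occurrence, POS), no pre-trained embeddings are needed to initialize node features, which is the efficiency point the abstract makes.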
Aspect Sentiment Triplet Extraction (ASTE) is a challenging task in natural language processing (NLP) that aims to extract triplets from comments, where each triplet comprises an aspect term, an opinion term, and the sentiment polarity of the aspect term. A neural network model for this task can enable systems to identify and extract the most meaningful and relevant information from comment sentences, ultimately leading to better products and services for consumers. Most existing end-to-end models focus solely on learning the interactions between the three elements of a triplet and contextual words, ignoring the rich affective knowledge contained in each word and paying insufficient attention to the relationships between multiple triplets in the same sentence. To address this gap, this study proposes a novel end-to-end model, the Dual Graph Convolutional Networks Integrating Affective Knowledge and Position Information (DGCNAP). The model jointly considers contextual features and affective knowledge by introducing affective knowledge from SenticNet into the dependency-graph construction of two parallel channels. In addition, a novel multi-target position-aware function is added to the graph convolutional network (GCN) to reduce the impact of noise and capture the relationships between potential triplets in the same sentence by assigning greater positional weights to words close to aspect or opinion terms. Experimental results on the ASTE-Data-V2 datasets demonstrate that our model significantly outperforms other state-of-the-art models, with F1 scores on 14res, 14lap, 15res, and 16res of 70.72, 57.57, 61.19, and 69.58, respectively.
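The position-aware idea in the abstract, giving larger weights to words near aspect or opinion terms so distant noise is down-weighted, can be sketched with a simple distance decay. The decay formula and function name below are illustrative assumptions, not the paper's exact multi-target function.

```python
def position_weights(sent_len, target_indices):
    """Weight each token by 1 / (1 + d), where d is the distance to the
    nearest target (aspect or opinion) token. Supports multiple targets,
    so words near any potential triplet element are emphasized."""
    weights = []
    for i in range(sent_len):
        d = min(abs(i - t) for t in target_indices)
        weights.append(1.0 / (1.0 + d))
    return weights

# e.g. an aspect term occupying positions 2-3 of a 6-token sentence
w = position_weights(6, [2, 3])
```

Such weights could then scale the GCN's adjacency or node features, so that tokens far from every target contribute little to the aggregated representation.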