Transforming the task of information extraction into a machine reading comprehension (MRC) framework has shown promising results. The MRC model takes the context and query as inputs to the encoder, and the decoder extracts one or more text spans as answers (entities and relationships) from the text. Existing approaches typically use multi-layer encoders, such as Transformers, to generate hidden features of the source sequence. However, increasing the number of encoder layers can make the granularity of the representation coarser and the hidden features of different words more similar, which can cause the model to make incorrect predictions. To address this issue, a new method called the multi-granularity attention multi-scale self-learning network (MAML-NET) is proposed, which enhances the model's understanding ability by utilizing representations of the source sequence at different granularities. In addition, through the proposed multi-scale self-learning attention mechanism, MAML-NET can independently learn task-related information in both global and local dimensions from the learned multi-granularity features. Experimental results on two information extraction tasks, named entity recognition and entity relationship extraction, demonstrate that the method outperforms existing MRC-based approaches and achieves the best performance on five benchmark datasets.
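A minimal PyTorch sketch of the general idea: a span-extraction head that fuses hidden states taken from several encoder depths, so that both coarse and fine granularities contribute to the answer prediction. The layer selection, gating scheme, and dimensions below are illustrative assumptions, not the actual MAML-NET architecture.

```python
# Sketch: MRC-style span extraction over multi-granularity encoder states.
import torch
import torch.nn as nn

class MultiGranularitySpanExtractor(nn.Module):
    def __init__(self, hidden_size=768, num_layers_used=3):
        super().__init__()
        # One learned scalar gate per selected encoder layer.
        self.layer_gates = nn.Parameter(torch.zeros(num_layers_used))
        self.start_head = nn.Linear(hidden_size, 1)  # span start logits
        self.end_head = nn.Linear(hidden_size, 1)    # span end logits

    def forward(self, layer_states):
        # layer_states: list of (batch, seq_len, hidden) tensors from different
        # encoder depths; lower layers keep finer, more lexical granularity.
        stacked = torch.stack(layer_states, dim=0)            # (L, B, T, H)
        weights = torch.softmax(self.layer_gates, dim=0)      # (L,)
        fused = (weights[:, None, None, None] * stacked).sum(dim=0)  # (B, T, H)
        start_logits = self.start_head(fused).squeeze(-1)     # (B, T)
        end_logits = self.end_head(fused).squeeze(-1)         # (B, T)
        return start_logits, end_logits

# Usage with dummy states standing in for [query; context] encoder outputs.
states = [torch.randn(2, 16, 768) for _ in range(3)]
start, end = MultiGranularitySpanExtractor()(states)
print(start.shape, end.shape)  # torch.Size([2, 16]) torch.Size([2, 16])
```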
When an entity contains one or more other entities, these entities are referred to as nested entities. The layered BiLSTM-CRF model can use multiple BiLSTM layers to identify nested entities. However, as the number of layers increases, the number of labels the model can learn decreases, and it may not predict any entities at all, forcing the model to stop stacking. Furthermore, the model is constrained by the one-way propagation of information from lower layers to higher layers: incorrect entities extracted by the outer layer will affect the performance of the inner layer. To address these issues, we propose a novel neural network for nested named entity recognition (NER) that dynamically stacks flat NER layers. Each flat NER layer captures contextual information based on a pretrained model with strong feature extraction capabilities. The model parameters of each flat NER layer and its input are entirely independent: the input of every layer is the full set of word representations generated from the input sequence by the embedding layer. This independent input ensures that flat NER layers do not interfere with one another during training and testing, which reduces error propagation. Experiments show that our model obtains F1 scores of 76.9%, 78.1%, and 78.0% on the ACE2004, ACE2005, and GENIA datasets, respectively.
Index Terms: nested named entity recognition, pretrained model, natural language processing.
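A minimal PyTorch sketch of the stacking idea: several flat NER layers with fully independent parameters, each reading the same embedding-layer output rather than the previous layer's predictions. The encoder choice, tag count, and fixed depth are assumptions for illustration only.

```python
# Sketch: stacked flat NER layers that all consume the same embeddings.
import torch
import torch.nn as nn

class FlatNERLayer(nn.Module):
    """One flat layer: contextual encoder + token-level tag classifier."""
    def __init__(self, hidden_size, num_tags):
        super().__init__()
        self.encoder = nn.LSTM(hidden_size, hidden_size // 2,
                               batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(hidden_size, num_tags)

    def forward(self, embeddings):
        contextual, _ = self.encoder(embeddings)
        return self.classifier(contextual)  # (batch, seq_len, num_tags)

class StackedFlatNER(nn.Module):
    def __init__(self, hidden_size=768, num_tags=9, max_depth=3):
        super().__init__()
        # The parameters of each flat layer are fully independent.
        self.layers = nn.ModuleList(
            [FlatNERLayer(hidden_size, num_tags) for _ in range(max_depth)])

    def forward(self, embeddings):
        # Every layer receives the raw embedding-layer output, not another
        # layer's predictions, which limits error propagation across depths.
        return [layer(embeddings) for layer in self.layers]

# Dummy word representations standing in for pretrained-model embeddings.
emb = torch.randn(2, 12, 768)
logits_per_depth = StackedFlatNER()(emb)
print(len(logits_per_depth), logits_per_depth[0].shape)
```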
In recent years, many scholars have used word lexicons to incorporate word information into character-based models to improve the performance of Chinese relation extraction (RE). For example, Li et al. proposed the MG-Lattice model in 2019 and achieved state-of-the-art (SOTA) results. However, MG-Lattice still suffers from information loss due to its model structure, which affects the performance of Chinese RE. This paper proposes an adaptive method that incorporates word information at the embedding layer, using a word lexicon to merge all words that match each character into a character-based input model, thereby addressing the information loss problem of MG-Lattice. The method can be combined with other general neural network architectures and is transferable. Experiments on two benchmark Chinese RE datasets show that our method achieves an inference speed up to 12.9 times faster than the SOTA model, along with better performance. The results also show that combining this method with the BERT pretrained model effectively supplements the information obtained from the pretrained model, further improving the performance of Chinese RE.
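A minimal sketch of the embedding-layer fusion idea: for each character, the embeddings of all lexicon words that match it are pooled and concatenated onto the character embedding. The pooling rule and vector sizes are assumptions, not the paper's exact formulation.

```python
# Sketch: merging lexicon-matched word embeddings into a character embedding.
import torch

def fuse_char_with_lexicon(char_emb, matched_word_embs, word_dim=50):
    """char_emb: (char_dim,) tensor for one character.
    matched_word_embs: list of (word_dim,) tensors for every lexicon word
    that contains this character; may be empty."""
    if matched_word_embs:
        word_part = torch.stack(matched_word_embs).mean(dim=0)  # average pooling
    else:
        word_part = torch.zeros(word_dim)  # no lexicon match for this character
    # Concatenate so downstream layers see both character and word evidence.
    return torch.cat([char_emb, word_part])

char_emb = torch.randn(100)                    # assumed char_dim = 100
matches = [torch.randn(50), torch.randn(50)]   # e.g. two words matched from the lexicon
fused = fuse_char_with_lexicon(char_emb, matches)
print(fused.shape)  # torch.Size([150])
```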
Topic models can extract consistent themes from large corpora for research purposes. In recent years, combining pretrained language models with neural topic models has gained attention among scholars. However, this approach has a drawback: on short texts, the topics obtained by such models are of low quality and incoherent, because word frequencies are reduced (word co-occurrence is insufficient) in short texts compared to long texts. To address these issues, we propose a neural topic model based on SBERT and data augmentation. First, our proposed easy data augmentation (EDA) method with keyword combination helps overcome the sparsity problem in short texts. Then, an attention mechanism is used to focus on keywords related to the topic and reduce the impact of noise words. Next, the SBERT model, trained on a large and diverse dataset, generates high-quality semantic vectors for the short texts. Finally, we fuse the attention-weighted augmented data with the obtained high-quality semantic information, and the fused features are input into a neural topic model to obtain high-quality topics. Experimental results on a public English dataset show that our model generates high-quality topics, with average scores improving by 2.5% for topic coherence and 1.2% for topic diversity compared to the baseline model.
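A minimal PyTorch sketch of the fusion step described above: an attention-weighted bag-of-words representation of the augmented short text is concatenated with an SBERT sentence vector before being mapped to a document-topic mixture. The dimensions and the simple per-word attention form are illustrative assumptions.

```python
# Sketch: fusing attention-weighted BoW features with SBERT vectors for a topic model.
import torch
import torch.nn as nn

class FusionEncoder(nn.Module):
    def __init__(self, vocab_size=2000, sbert_dim=384, num_topics=20):
        super().__init__()
        self.word_attention = nn.Linear(vocab_size, vocab_size)  # per-word scores
        self.to_topics = nn.Sequential(
            nn.Linear(vocab_size + sbert_dim, 256), nn.ReLU(),
            nn.Linear(256, num_topics))

    def forward(self, bow, sbert_vec):
        # Down-weight noise words, emphasise topic-related keywords.
        attn = torch.softmax(self.word_attention(bow), dim=-1)
        weighted_bow = attn * bow
        fused = torch.cat([weighted_bow, sbert_vec], dim=-1)
        return torch.softmax(self.to_topics(fused), dim=-1)  # document-topic mixture

bow = torch.rand(4, 2000)     # bag-of-words counts of the EDA-augmented short texts
sbert = torch.randn(4, 384)   # e.g. sentence vectors from an SBERT encoder
print(FusionEncoder()(bow, sbert).shape)  # torch.Size([4, 20])
```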