Mention detection is an important component of a Coreference Resolution (CR) system, in which mentions such as names, nominals, and pronominals are identified. These mentions can be either coreferential mentions or singleton (non-coreferential) mentions. Coreferential mentions are mentions in a text that refer to the same real-world entity, whereas singleton mentions occur only once in the text and do not participate in coreference because they are not mentioned again in the following text. Filtering out singleton mentions can substantially improve the performance of a CR process. This paper proposes a singleton mention detection module for Hindi text based on a Fully Connected Network (FCN) and a Long Short-Term Memory (LSTM) network. The model identifies singleton mentions so that they can be filtered out to reduce the search space for CR. A CR system searches the preceding text for an earlier reference to each mention, so removing singletons from the list of mentions reduces both the search time and the space required. The model uses a few hand-crafted features, context information, and word embeddings from word2vec and from the multilingual Bidirectional Encoder Representations from Transformers (mBERT) language model. A coreference-annotated Hindi dataset comprising 3.6K sentences and 78K tokens is used for the task. The singleton mention detection model is analyzed extensively by experimenting with context windows of various lengths for each mention. Among the context settings tried (window sizes such as 2, 3, 4, 5, etc., and all previous and all next words of each mention), a context window of size two gives the best performance. The Precision, Recall, and F-measure of the LSTM-FCN model with mBERT (Word + Context + Syntactic) and a context window of size two for identifying singleton mentions are 63%, 71%, and 67%, respectively.
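Two ideas in the abstract can be made concrete with a small sketch: taking a fixed-size context window around a mention, and checking that the reported F-measure agrees with the reported Precision and Recall. This is an illustration under stated assumptions, not the paper's implementation. The function names, the `<pad>` token, the padding behaviour at sentence boundaries, and the placeholder tokens are all hypothetical, and the F-measure is assumed to be the balanced harmonic mean (F1).

```python
from typing import List, Tuple

PAD = "<pad>"  # hypothetical padding token; the paper's choice is not stated


def context_window(
    tokens: List[str], start: int, end: int, k: int
) -> Tuple[List[str], List[str]]:
    """Return the k tokens before and the k tokens after the mention
    tokens[start:end], padding with PAD at sentence boundaries."""
    left = tokens[max(0, start - k):start]
    right = tokens[end:end + k]
    # Pad so that both sides always have exactly k entries.
    left = [PAD] * (k - len(left)) + left
    right = right + [PAD] * (k - len(right))
    return left, right


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Placeholder tokens standing in for a tokenized sentence (not real data).
tokens = ["w0", "w1", "w2", "w3", "w4", "w5"]

# Mention is the single token "w2"; window size two on each side.
mid_left, mid_right = context_window(tokens, 2, 3, k=2)

# Mention at the start of the sentence: the left side is all padding.
edge_left, edge_right = context_window(tokens, 0, 1, k=2)

# Reported Precision 63% and Recall 71% give an F1 of about 0.67.
f1 = round(f_measure(0.63, 0.71), 2)
```

With a window of size two, each mention contributes at most four context tokens, which keeps the input short compared with using all previous and all next words.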