Document-level relation extraction aims to extract the relationships among entities in a document. Compared with sentence-level relation extraction, the text is much longer and contains many more entities, which makes document-level relation extraction a harder task. The number and complexity of the entities make it necessary to provide the model with sufficient entity information. To address this problem, we propose a document-level entity mask method with type information (DEMMT), which masks each entity mention with special tokens. With this masking, the model can accurately identify every mention of an entity and its type. Based on DEMMT, we propose a BERT-based one-pass model that predicts the relationships among all entities by processing the text only once. We evaluate the proposed model on DocRED, a large-scale open-domain document-level relation extraction dataset. On the manually annotated part of DocRED, our approach obtains a 6% F1 improvement over state-of-the-art models that do not use pre-trained language models, and a 2% F1 improvement over BERT without DEMMT. On the distantly supervised part of DocRED, the F1 improvement is 2% over models without pre-training and 5% over plain BERT.
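To make the entity-mask idea concrete, the sketch below shows one way mentions could be replaced with type-marked special tokens on a DocRED-style input. The special-token format "[E{i}-{TYPE}]", the assumption that all mentions of an entity share one token, and the helper name are illustrative assumptions, not the paper's exact implementation.

```python
# A minimal sketch of entity masking with type information, assuming a
# DocRED-style input: a token list plus entity annotations giving each
# entity's type and the token spans of its mentions.

def mask_entity_mentions(tokens, entities):
    """Replace each entity mention with a single special token encoding the
    entity index and its type, so the encoder sees every mention and type."""
    # Record where mentions start and which positions they cover.
    starts, covered = {}, set()
    for ent_idx, entity in enumerate(entities):
        special = f"[E{ent_idx + 1}-{entity['type']}]"   # assumed token format
        for start, end in entity["mentions"]:            # half-open token spans
            starts[start] = special
            covered.update(range(start, end))

    masked = []
    for pos, token in enumerate(tokens):
        if pos in starts:
            masked.append(starts[pos])    # mention collapsed to special token
        elif pos not in covered:
            masked.append(token)          # ordinary token kept as-is
    return masked


if __name__ == "__main__":
    tokens = "Barack Obama was born in Hawaii .".split()
    entities = [
        {"type": "PER", "mentions": [(0, 2)]},   # "Barack Obama"
        {"type": "LOC", "mentions": [(5, 6)]},   # "Hawaii"
    ]
    print(mask_entity_mentions(tokens, entities))
    # ['[E1-PER]', 'was', 'born', 'in', '[E2-LOC]', '.']
```

In such a setup, the new special tokens would typically be added to the tokenizer vocabulary so that a BERT-style encoder can learn dedicated embeddings for each entity slot and type.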