Named entity recognition for Chinese judgment documents based on BiLSTM and CRF

Huang, Wenming; Hu, Dengrui; Deng, Zhenrong; Nie, Jian‐Yun

doi:10.1186/s13640-020-00539-x

Cited by 13 publications

(9 citation statements)

References 23 publications


“…BiLSTM is a bidirectional LSTM model, that is, a neural network that combines forward LSTM and backward LSTM. Through two-way propagation, BiLSTM can obtain the coding information from back to front and capture the context relationship through two-way coding [48]. The structure of the BiLSTM model is shown in Fig.…”

Section: Methods (mentioning)

confidence: 99%

Deep learning-based methods for natural hazard named entity recognition

Sun, Liu, Cui et al. 2022

Sci Rep


Natural hazard named entity recognition is a technique for recognizing natural hazard entities in large volumes of text. It facilitates the acquisition of natural hazard information and provides a reference for natural hazard mitigation. Named entity recognition faces many challenges, such as the fast change, multiple types and various forms of named entities, which make research on natural hazard named entity recognition difficult. To address this problem, this paper constructed an annotated natural disaster corpus for training and evaluating models, and selected and compared several deep learning methods based on word vector features. A deep learning method for natural hazard named entity recognition can automatically mine text features and reduce the dependence on manual rules. This paper compares and analyzes the deep learning models from three aspects: pretraining, feature extraction and decoding. A natural hazard named entity recognition method based on deep learning is proposed, namely the XLNet-BiLSTM-CRF model. Finally, the research hotspots of natural hazards papers in the past 10 years were obtained through this model. After training, the precision of the XLNet-BiLSTM-CRF model is 92.80%, the recall is 91.74%, and the F1-score is 92.27%. The results show that this method outperforms the other methods compared and can effectively recognize natural hazard named entities.
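The precision, recall and F1-score reported in this abstract can be checked against one another, since the F1-score is conventionally the harmonic mean of precision and recall. The snippet below is a minimal sketch of that check; the function name `f1_score` is an illustrative choice and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, in percent.
precision, recall = 92.80, 91.74
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2f}%")  # prints "F1 = 92.27%"
```

The computed value matches the reported F1-score of 92.27%.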



“…Although IE systems have a long history in the scientific literature, there are few studies that analyze their use in commercial NLP pipelines. There are legal NER datasets similar to the ones used in this work [3,2,19,17], but they rarely reflect the complexities of production pipelines, such as processing scanned documents with OCR errors and extracting nested and discontiguous entities.…”

Section: Related Work (mentioning)

confidence: 99%

Sequence-to-Sequence Models for Extracting Information from Registration and Legal Documents

Pires, Souza, Rosa et al. 2022

Preprint


A typical information extraction pipeline consists of token- or span-level classification models coupled with a series of pre- and postprocessing scripts. In a production pipeline, requirements often change, with classes being added and removed, which leads to nontrivial modifications to the source code and the possible introduction of bugs. In this work, we evaluate sequence-to-sequence models as an alternative to token-level classification methods for information extraction from legal and registration documents. We fine-tune models that jointly extract the information and generate the output already in a structured format. Post-processing steps are learned during training, thus eliminating the need for rule-based methods and simplifying the pipeline. Furthermore, we propose a novel method to align the output with the input text, thus facilitating system inspection and auditing. Our experiments on four real-world datasets show that the proposed method is an alternative to classical pipelines.


“…As such, one multi-task learning architecture has been designed jointly with two aims, recognizing the entities and their types simultaneously. The work in [13] proposed combining character and sentence vectors trained by the distributed memory model of paragraph vectors (PV-DM). Additionally, the BiLSTM model was implemented with an additional conditional random field (CRF) layer, which is used to tag the input sentence.…”

Section: Entity Identification (mentioning)

confidence: 99%

Transfer Learning on Knowledge Graph Construction: A Case Study of Investigating Gas-Mining Risk Report

Wang, Wei et al. 2022

Mathematical Problems in Engineering


This study addressed the problem of automated Knowledge Graph (KG) construction from unstructured documents, with the assistance of transfer learning. Despite the large amount of effort devoted to discovering KGs, how to explore unknown KGs from existing knowledge remains a challenge. In this paper, we first formulate the KG detection process as a transfer-learning problem, which consists of two main steps. First, we pretrain a backbone model on the source domain. Because the source domain provides sufficient samples, this backbone model can be trained well. Second, we migrate this model (from the known domain) to the target domain by fine-tuning key parameters. The fine-tuning operation requires little computation and is therefore efficient. As such, the backbone model can be successfully transferred to the target domain, even with limited training samples. Experimental evaluations using one real-world dataset of gas-mining reports demonstrate the advantages of utilizing the proposed algorithm to construct KGs using transferable information.
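The citation statement for this entry describes a CRF layer that tags the input sentence on top of BiLSTM outputs. As a rough illustration of how such a layer typically picks a tag sequence, here is a minimal Viterbi decoding sketch. The scores, the B/I/O tag set and the name `viterbi_decode` are all hypothetical and are not taken from the cited papers; in a real model the emission scores would come from the BiLSTM and the transition scores would be learned.

```python
from typing import Dict, List


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence.

    emissions[i][t] is the score of tag t at token i;
    transitions[p][c] is the score of moving from tag p to tag c.
    """
    best = [dict(emissions[0])]  # best path score ending in each tag
    backptr = []                 # best previous tag for each step and tag
    for scores in emissions[1:]:
        step, ptr = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            step[cur] = best[-1][prev] + transitions[prev][cur] + scores[cur]
            ptr[cur] = prev
        best.append(step)
        backptr.append(ptr)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    return path[::-1]


tags = ["B", "I", "O"]
# Hypothetical transition scores: O -> I is an invalid BIO move, B -> I is favored.
transitions = {p: {c: 0.0 for c in tags} for p in tags}
transitions["O"]["I"] = -10.0
transitions["B"]["I"] = 1.0
# Hypothetical per-token emission scores for a three-token sentence.
emissions = [
    {"B": 0.8, "I": 0.0, "O": 1.0},
    {"B": 0.3, "I": 2.0, "O": 0.5},
    {"B": 0.0, "I": 0.2, "O": 1.5},
]

greedy = [max(tags, key=lambda t: e[t]) for e in emissions]
print(greedy)                                        # ['O', 'I', 'O']
print(viterbi_decode(emissions, transitions, tags))  # ['B', 'I', 'O']
```

Picking the best tag per token independently yields the invalid sequence O, I, O, whereas decoding with transition scores yields B, I, O. Avoiding such inconsistent label sequences is the usual motivation for placing a CRF layer after the BiLSTM.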

