2021
DOI: 10.48550/arxiv.2105.07148
Preprint
Lexicon Enhanced Chinese Sequence Labeling Using BERT Adapter

Abstract: Lexicon information and pre-trained models, such as BERT, have been combined to explore Chinese sequence labeling tasks due to their respective strengths. However, existing methods fuse lexicon features only via a shallow, randomly initialized sequence layer and do not integrate them into the bottom layers of BERT. In this paper, we propose Lexicon Enhanced BERT (LEBERT) for Chinese sequence labeling, which integrates external lexicon knowledge into BERT layers directly via a Lexicon Adapter layer. Compared …
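To make the adapter idea concrete, the following is a minimal, hypothetical PyTorch sketch of a lexicon adapter in the spirit described above: matched-word embeddings are projected into the BERT hidden space, weighted by a char-to-word attention, and added back to the character hidden states. The module and parameter names (LexiconAdapter, word_proj, attn_w) are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of a lexicon adapter (assumed shapes and names, not the official LEBERT code).
import torch
import torch.nn as nn

class LexiconAdapter(nn.Module):
    def __init__(self, hidden_size: int, word_embed_dim: int):
        super().__init__()
        # Project matched-word embeddings into the BERT hidden space.
        self.word_proj = nn.Sequential(
            nn.Linear(word_embed_dim, hidden_size),
            nn.Tanh(),
            nn.Linear(hidden_size, hidden_size),
        )
        # Bilinear char-to-word attention.
        self.attn_w = nn.Parameter(torch.empty(hidden_size, hidden_size))
        nn.init.xavier_uniform_(self.attn_w)
        self.layer_norm = nn.LayerNorm(hidden_size)

    def forward(self, char_hidden, word_embeds, word_mask):
        # char_hidden: (B, L, H) hidden states from a BERT layer
        # word_embeds: (B, L, W, D) embeddings of words matched to each character
        # word_mask:   (B, L, W) 1 for real matched words, 0 for padding
        words = self.word_proj(word_embeds)                               # (B, L, W, H)
        scores = torch.einsum("blh,hk,blwk->blw", char_hidden, self.attn_w, words)
        scores = scores.masked_fill(word_mask == 0, float("-inf"))
        attn = torch.softmax(scores, dim=-1)
        # Characters with no matched word get an all-zero attention row instead of NaNs.
        attn = torch.where(word_mask.sum(-1, keepdim=True) > 0, attn, torch.zeros_like(attn))
        fused = torch.einsum("blw,blwh->blh", attn, words)                # weighted word vector
        return self.layer_norm(char_hidden + fused)
```

Following the description above (and the citing papers below), a module like this would sit between Transformer layers of BERT, taking the hidden states of one layer and passing the fused result to the next.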

Cited by 17 publications (23 citation statements) | References 37 publications
“…For the public datasets, the original test and validation sets are preserved and used in the respective evaluation and validation stages. The experimental results show that our proposed model outperforms traditional models, including BiLSTM-CRF, Lattice LSTM [20], Lexicon [21], BERT, and BERT-CRF, particularly in low-resource scenarios. Across all four datasets, our model consistently outperforms the widely used BERT-CRF model, achieving an average increase of 3% in F1 score.…”
Section: Methods (mentioning)
confidence: 98%
“…Lattice LSTM (Zhang and Yang, 2018): 89.37 / 90.84 / 90.10
BERT-CRF (Devlin et al., 2018): 88.46 / 92.35 / 90.37
ERNIE (Zhang et al., 2019): 88.87 / 92.27 / 90.53
FLAT (Li et al., 2020): 88.76 / 92.07 / 90.38
LEBERT (Liu et al., 2021): 86.53 / 92.91 / 89.60…”
Section: Dialoamc as a New Benchmark (mentioning)
confidence: 99%
“…Experimental settings We use several popular Chinese named entity models as baselines, including: 1) Lattice LSTM (Zhang and Yang, 2018), an extension of Char-LSTM that incorporates lexical information into a native LSTM; 2) BERT (Devlin et al., 2018), a bidirectional Transformer encoder with large-scale language pre-training; 3) ERNIE (Zhang et al., 2019), an improved BERT that adopts entity-level masking and phrase-level masking during pre-training; 4) FLAT (Li et al., 2020), a flat-lattice Transformer that converts the lattice structure into a flat structure consisting of spans; 5) LEBERT (Liu et al., 2021), a lexicon-enhanced BERT for Chinese sequence labelling, which integrates external lexicon knowledge into BERT layers via a lexicon adapter layer. We train each model for 10 epochs, using the default parameters of the corresponding code repository.…”
Section: Named Entity Recognition (mentioning)
confidence: 99%
“…Consider that the emission score reflects the capability of the preceding encoder; that is, the encoder cannot perfectly generalize to unseen (OOV) words, so the emission score can be biased. In other words, as analyzed in Wei et al. (2021), without a hard mechanism to enforce the transition rules, a conventional CRF can occasionally produce illegal predictions, i.e., wrong tag paths under the current decoding framework.…”
Section: PCRF Inference Layer (mentioning)
confidence: 99%
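The "illegal predictions" issue discussed above can be made concrete with a hard transition constraint: under a BIO scheme, a tag such as I-PER may only follow B-PER or I-PER, and masking every other transition during Viterbi decoding removes such paths by construction. The sketch below illustrates that masking mechanism only, not the cited paper's PCRF; the tag set and function names are hypothetical.

```python
# Hard-masked Viterbi decoding over BIO tags (illustrative sketch, assumed tag set).
import numpy as np

def build_transition_mask(tags):
    """mask[i, j] is True when tag j may legally follow tag i under BIO."""
    n = len(tags)
    mask = np.ones((n, n), dtype=bool)
    for i, prev in enumerate(tags):
        for j, curr in enumerate(tags):
            if curr.startswith("I-"):
                ent = curr[2:]
                # I-X is legal only after B-X or I-X of the same entity type.
                mask[i, j] = prev in (f"B-{ent}", f"I-{ent}")
    return mask

def constrained_viterbi(emissions, transitions, mask):
    """emissions: (seq_len, n_tags); transitions: (n_tags, n_tags) learned scores."""
    neg_inf = -1e9
    trans = np.where(mask, transitions, neg_inf)              # hard constraint on transitions
    seq_len, n_tags = emissions.shape
    score = emissions[0].copy()
    backptr = np.zeros((seq_len, n_tags), dtype=int)
    for t in range(1, seq_len):
        cand = score[:, None] + trans + emissions[t][None, :]  # (prev_tag, curr_tag)
        backptr[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    best = [int(score.argmax())]
    for t in range(seq_len - 1, 0, -1):
        best.append(int(backptr[t, best[-1]]))
    return best[::-1]

tags = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG"]
mask = build_transition_mask(tags)
emissions = np.random.randn(6, len(tags))
transitions = np.random.randn(len(tags), len(tags))
print([tags[i] for i in constrained_viterbi(emissions, transitions, mask)])
```

A soft CRF only learns to make illegal transitions unlikely; the mask above forbids them outright, which is the kind of hard mechanism the quoted passage alludes to.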
“…Ke et al. (2020) present a CWS-specific pre-trained model, which employs a unified architecture to make use of segmentation knowledge from different criteria. Liu et al. (2021) propose a lexicon-enhanced BERT, which combines character and lexicon features as the input. In addition, it attaches a lexicon adapter between the Transformer layers to integrate lexicon knowledge into BERT.…”
Section: Introduction (mentioning)
confidence: 99%
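The "character and lexicon features as the input" step rests on matching each character of a sentence against an external word lexicon. Below is a minimal, hypothetical sketch of that matching step, assuming the lexicon is a plain Python set and using brute-force substring lookup; a real implementation would more likely use a trie, but the resulting character-to-word pairs are the same.

```python
# Illustrative character-to-word matching against a lexicon (assumed plain-set lexicon).
def match_lexicon(sentence, lexicon, max_word_len=4):
    """Return, for each character position, the lexicon words that cover it."""
    matches = [[] for _ in sentence]
    for start in range(len(sentence)):
        for end in range(start + 1, min(start + max_word_len, len(sentence)) + 1):
            word = sentence[start:end]
            if word in lexicon:
                # Every character inside a matched word receives that word as a feature.
                for pos in range(start, end):
                    matches[pos].append(word)
    return matches

lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}
print(match_lexicon("南京市长江大桥", lexicon))
```

Each character's list of matched words is what the adapter consumes as lexicon features; for example, the character 长 at position 3 is covered by 市长, 长江, and 长江大桥.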