Proceedings of the 28th International Conference on Computational Linguistics 2020
DOI: 10.18653/v1/2020.coling-main.54

Named Entity Recognition for Chinese biomedical patents

Abstract: There is a large body of work on Biomedical Entity Recognition (Bio-NER) for English, but there have been only a few attempts addressing NER for Chinese biomedical texts. Because of the growing amount of Chinese biomedical discoveries being patented, and the lack of NER models for patent data, we train and evaluate NER models for the analysis of Chinese biomedical patent data, based on BERT. By doing so, we show the value and potential of this domain-specific NER task. For the evaluation of our methods we built ou…


Cited by 20 publications (11 citation statements)
References 16 publications
“…BERT-BiLSTM-CRF (Huang et al., 2015) is an established model for sequence tagging (and the state of the art for named entity recognition in different languages [Huang et al., 2015; Hu and Verberne, 2020]), which uses a BiLSTM to encode the sequence information and then performs sequence tagging with a conditional random field (CRF).…”
Section: Methods
confidence: 99%
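For readers unfamiliar with the architecture this statement names, a minimal sketch of a BERT-BiLSTM-CRF tagger follows. It assumes the `transformers` and `pytorch-crf` packages; the checkpoint (bert-base-chinese), hidden size, and tag count are illustrative choices, not details taken from the cited paper.

```python
# Minimal sketch of BERT-BiLSTM-CRF for sequence tagging (PyTorch).
# Assumes the `transformers` and `pytorch-crf` packages are installed;
# all hyperparameters here are illustrative.
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF  # pytorch-crf package

class BertBiLstmCrf(nn.Module):
    def __init__(self, num_tags: int, lstm_hidden: int = 256):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-chinese")
        # BiLSTM re-encodes BERT's contextual embeddings.
        self.lstm = nn.LSTM(
            input_size=self.bert.config.hidden_size,
            hidden_size=lstm_hidden,
            bidirectional=True,
            batch_first=True,
        )
        self.emit = nn.Linear(2 * lstm_hidden, num_tags)  # emission scores
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, tags=None):
        # Contextual character representations from BERT.
        hidden = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        encoded, _ = self.lstm(hidden)
        emissions = self.emit(encoded)
        mask = attention_mask.bool()
        if tags is not None:
            # Training: negative log-likelihood under the CRF.
            return -self.crf(emissions, tags, mask=mask, reduction="mean")
        # Inference: Viterbi-decoded best tag sequence per sentence.
        return self.crf.decode(emissions, mask=mask)
```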
“…However, incorrect word segmentation will result in propagation errors in entity detection, and a purely character-based approach will miss word-level information. Pre-trained language models such as BERT can generate contextual embeddings that outperform other character- or word-based approaches (Hu et al., 2020). More importantly, the BPE subword segmentation method is employed by BERT-based models, and word-level information is not explicitly modeled.…”
Section: Chinese NER
confidence: 99%
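A small illustration of the character-level behavior this statement describes: the standard BERT tokenizer splits CJK text into single characters before subword lookup, so no external word segmenter is involved. This assumes the `transformers` package and the public bert-base-chinese checkpoint; the example phrase is arbitrary.

```python
# The stock BERT tokenizer treats each CJK character as its own token,
# so Chinese input is effectively character-segmented.
# Assumes the `transformers` package.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
print(tokenizer.tokenize("生物医学专利"))
# -> ['生', '物', '医', '学', '专', '利']
```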
“…Yang (2019) simply added a softmax layer on top of BERT, achieving state-of-the-art performance on CWS. Meng et al. (2019) and Hu and Verberne (2020) showed that models using character features from BERT outperform static embedding-based approaches by a large margin for Chinese NER and Chinese POS tagging.…”
Section: Related Work
confidence: 99%
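A brief sketch of the contrast this statement draws between contextual character features from BERT and a static embedding lookup. The checkpoint and example sentence are illustrative assumptions, not details from the cited works.

```python
# Contextual character features (BERT) vs. a static lookup table.
# Assumes the `transformers` package; bert-base-chinese is illustrative.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertModel.from_pretrained("bert-base-chinese")

batch = tokenizer(["南京市长江大桥"], return_tensors="pt")
with torch.no_grad():
    # One vector per character, conditioned on the whole sentence:
    # the same character gets different vectors in different contexts.
    contextual = model(**batch).last_hidden_state   # (1, seq_len, 768)

# A static embedding assigns each character one fixed, context-free vector.
static_table = torch.nn.Embedding(model.config.vocab_size, model.config.hidden_size)
static = static_table(batch["input_ids"])           # same shape, no context
```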