2021
DOI: 10.1155/2021/3544281

TCMNER and PubMed: A Novel Chinese Character-Level-Based Model and a Dataset for TCM Named Entity Recognition

Abstract: Intelligent traditional Chinese medicine (TCM) has become a popular research field, driven by advances in deep learning technology. Important achievements have been made in representative tasks such as the automatic diagnosis of TCM syndromes and diseases and the generation of TCM herbal prescriptions. However, one unavoidable issue that still hinders progress is the lack of labeled samples, i.e., TCM medical records. As an efficient tool, the named entity recognition (NER) models trained on various TCM res…

citations
Cited by 4 publications
(4 citation statements)
references
References 23 publications
“…For the TCM NER task, we chose several typical methods with which to compare the proposed approach. We selected two advanced methods: the BiLSTM-CRF architecture [4], BERT-BiLSTM-CRF architecture [13], and a variant of the BERT model architecture that can fuse multigranularity information of text (Roberta-c) [33]. These methods are described as follows.…”
Section: Experimental Analysis (mentioning)
confidence: 99%
“…Roberta-c [33] is a variant of the BERT model and outperforms the BERT model in Chinese NLP tasks. C denotes a word-character set in a self-attention module and it can integrate character and word information.…”
Section: Experimental Analysis (mentioning)
confidence: 99%
“…NER is widely used in Chinese medical texts, involving electronic medical records [18, 19], traditional Chinese medicine texts [20, 21], clinical guidelines [22], disease subtypes [23], admission notes [24], PICO [25], disease and plant name [26], etc. Most approaches focus on feature extraction either by rule and dictionary-based means or by machine learning and deep learning means [27, 28].…”
Section: Related Work (mentioning)
confidence: 99%
“…This problem makes many named entity recognition models designed for English corpus ineffective in the dataset of ancient Chinese medicine books. In view of the unclear boundary of Chinese medicine entities, the existing Chinese medicine named entity recognition methods mainly use combined characters and word embedding as the input of the model [2,3] to help the model more accurately model word features. In these methods, the word acquisition method is the Chinese word segmentation tool, while the text language of ancient Chinese medicine books is mostly ancient Chinese, which has a large gap with modern Chinese, leading to the occurrence of incorrect word segmentation, and thus affecting the performance of the model.…”
Section: Introduction (mentioning)
confidence: 99%