2022
DOI: 10.1038/s41598-022-17806-8
A pre-trained BERT for Korean medical natural language processing

Abstract: With advances in deep learning and natural language processing (NLP), the analysis of medical texts is becoming increasingly important. Nonetheless, despite the importance of processing medical texts, no research on Korean medical-specific language models has been conducted. Korean medical text is particularly difficult to analyze because of the agglutinative characteristics of the language and the complex terminology of the medical domain. To solve this problem, we collected a Korean medical corpus an…
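
The abstract's point about agglutination and domain terminology can be seen directly in how a general-domain subword vocabulary handles medical terms. The short Python sketch below prints the subword pieces a general-domain Korean tokenizer produces for a few clinical terms; the checkpoint name klue/bert-base and the example terms are assumptions chosen for illustration, not artifacts of the paper.

# A minimal sketch, assuming the Hugging Face "transformers" package and
# an assumed general-domain Korean checkpoint ("klue/bert-base"); this is
# not the paper's own tokenizer or model.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("klue/bert-base")

# Illustrative Korean medical terms (acute myocardial infarction,
# meningitis, diabetic ketoacidosis); a general-domain vocabulary
# typically splits such terms into many subword pieces.
for term in ["급성심근경색", "뇌수막염", "당뇨병성 케톤산증"]:
    print(term, "->", tokenizer.tokenize(term))

A domain-specific vocabulary, or continued pretraining on medical text, is the usual remedy; the last citation statement below returns to this point.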

Cited by 35 publications (12 citation statements: 0 supporting, 12 mentioning, 0 contrasting) · References 17 publications

Citation statements, ordered by relevance:
“…Emilien et al. developed an NLP system for learning contextual embeddings from free-text clinical records in France [15]. Kim Y et al. developed an NLP system for processing a Korean medical corpus using BERT models [37]. Dahl et al. developed an NLP system for classifying Norwegian pediatric CT radiology reports.…”
Section: Discussion · Citation type: mentioning · Confidence: 99%
“…Nevertheless, the trends observed within this set of keywords are also reflected in the analysis provided in the following sections.
• French: … [23], construction of cohorts of similar patients [24], processing of electronic medical records [25], understanding of patients' answers in a French medical chatbot [26];
• German: evaluation of Transformers on clinical notes [27];
• Greek: improving the performance of localized healthcare virtual assistants [28];
• Hindi: classification of COVID-19 texts [29], a chatbot providing sexual and reproductive health information to young people [30];
• Italian: analysis of social media for quality of life in Parkinson's patients [31], sentiment analysis of opinions on COVID-19 vaccines [32,33], estimation of the incidence of infectious disease cases [34];
• Japanese: understanding psychiatric illness [35], detection of adverse events from narrative clinical documents [36];
• Korean: BERT model for processing medical documents [37], sentiment analysis of tweets about COVID-19 vaccines [38];…”
Section: Analysis of Abstract from Publications · Citation type: mentioning · Confidence: 99%
“…and institutions (like MIMIC-III), as well as data from social media, hospitals, bibliographical datasets, clinical trials, etc. Research in other languages is possible mainly thanks to the availability of data from social media [7,9,19,20,22,38,43,47] and documents from local hospitals [10,13,14,17,18,23,25,27,36,37,40,42]. In addition, this set of works in languages other than English relies on dedicated language models, which by now cover a great variety of languages.…”
Section: Languages Addressed · Citation type: mentioning · Confidence: 99%
“…We report biomedical and clinical language-specific PLMs that have become available since 2020 in Table 2. Most models are initialized with the weights of their language-specific general-domain counterparts [41][42][43], which has been the go-to method for creating domain-specific models. However, the general-domain vocabulary created within the model during pretraining is not representative of the biomedical or clinical domain and can harm downstream performance.…”
Section: Pretrained Language Models for LOE · Citation type: mentioning · Confidence: 99%
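
The statement above describes the now-standard recipe of initializing a domain-specific model from a general-domain checkpoint and continuing masked-language-model pretraining on in-domain text. A minimal sketch of that recipe with Hugging Face transformers and datasets follows; the checkpoint name klue/bert-base, the toy two-sentence corpus, and all hyperparameters are illustrative assumptions, not the configuration used in the cited works.

# A minimal sketch of domain-adaptive (continued) MLM pretraining from a
# general-domain checkpoint; names and hyperparameters are assumptions.
from datasets import Dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

checkpoint = "klue/bert-base"  # assumed general-domain Korean BERT
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)  # reuse general-domain weights

# Placeholder in-domain corpus; in practice this would be millions of
# clinical or biomedical sentences.
corpus = Dataset.from_dict({"text": [
    "환자는 고혈압과 당뇨병 병력이 있다.",      # "The patient has a history of hypertension and diabetes."
    "흉부 CT에서 특이 소견은 보이지 않았다.",  # "Chest CT showed no remarkable findings."
]})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])

# Standard BERT objective: dynamically mask 15% of tokens in each batch.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="medical-mlm", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()  # continued pretraining; fine-tune on downstream tasks afterwards

Note that the vocabulary caveat in the quoted passage still applies: this sketch reuses the general-domain WordPiece vocabulary, so domain terms remain fragmented unless the vocabulary is rebuilt or extended before continued pretraining.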