2020
DOI: 10.1109/access.2020.3018688

Testing Contextualized Word Embeddings to Improve NER in Spanish Clinical Case Narratives

Abstract: In the Big Data era, there is an increasing need to fully exploit and analyze the huge quantity of information available about health. Natural Language Processing (NLP) technologies can contribute by extracting relevant information from unstructured data contained in Electronic Health Records (EHR) such as clinical notes, patients' discharge summaries and radiology reports. The extracted information can help in health-related decision-making processes. The Named Entity Recognition (NER) task, which detects imp…

Cited by 21 publications (14 citation statements)
References 35 publications
“…Regarding the clinical domain in Spanish, we found the models Biomedical Roberta (Carrino et al, 2022) and SciELO Flair (Akhtyamova et al, 2020). In the first case, the main difference with our model is that Biomedical Roberta was trained on a corpus formed by several biomedical and clinical corpora, while we only used clinical narratives.…”
Section: Related Work (mentioning)
confidence: 98%
“…These embeddings, however, were not intrinsically evaluated nor compared performance-wise with other embeddings, and they were not made available for use. In another work, Akhtyamova et al (2020) used the Flair (Akbik et al, 2019) and BERT (Devlin et al, 2018) models to calculate word embeddings for the Spanish clinical domain as part of a named entity recognition (NER) task and Rojas et al (2022) computed another Flair language model from clinical narratives in Spanish. These models utilize contextualized word embeddings that take into account the word context upon embedding calculation.…”
Section: Related Work (mentioning)
confidence: 99%
“…In non-English or low-resource language. Most works in biomedical pre-trained language models are with English corpora, and a few about Chinese [292], German [25], Japanese [102,245], Spanish [6,7,145,159], Korean [109], Russian [239], Italian [28], Arabic [13,23], French [41], Portuguese [208,209] etc. For the non-English biomedical tasks, there are two mainstream solutions: a single non-English language paradigm and a multi-linguistic paradigm.…”
Section: Future Trends (mentioning)
confidence: 99%