2020 · Preprint
DOI: 10.48550/arxiv.2009.09223
BioALBERT: A Simple and Effective Pre-trained Language Model for Biomedical Named Entity Recognition

Abstract: In recent years, with the growing volume of biomedical documents and advances in natural language processing algorithms, research on biomedical named entity recognition (BioNER) has increased exponentially. However, BioNER remains challenging because NER in the biomedical domain (i) is often restricted by the limited amount of training data, (ii) must handle entities that can refer to multiple types and concepts depending on context, and (iii) relies heavily on acronyms that are sub-domain specific. Exis…

Cited by 5 publications (8 citation statements) · References 22 publications
“…BioBERT [121], BlueBERT [179], SciBERT [16], BioELMo [93], PubMedBERT [66], BioMegatron [215], Yuan et al [286], Alsentzer et al [10], Singh et al [206], Zhu et al [301], Si et al [216], Sheikhshab et al [213], Khan et al [103], Giorgi et al [63], Naseem [168], Gao et al [59], Poerner et al [186], Sun et al [229], for Spanish [6,75,159],…”
Section: Named Entity Recognition · Citation type: mentioning
confidence: 99%
“…It utilized BioBERT as the transformer encoder layer and multiple datasets in the task-specific layers. With the development of general-domain language models, Naseem [168] proposed an effective domain-specific language model, BioALBERT, trained on biomedical corpora (PubMed abstracts and PMC full-text articles) for biomedical named entity recognition. Giorgi et al [63] proposed an end-to-end model for jointly extracting named entities and their relations using the pre-trained language model BERT.…”
Section: Named Entity Recognition; Biomedical Named Entity Recognition… · Citation type: mentioning
confidence: 99%
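The statements above share one recipe: a pretrained biomedical encoder with a token-classification head in the task-specific layer. Below is a minimal sketch of that setup, assuming the Hugging Face transformers library; the checkpoint name and BIO label set are illustrative assumptions, not the exact configuration of any cited work.

```python
# Minimal sketch: a pretrained biomedical encoder plus a token-classification
# head for NER. Checkpoint and label set are illustrative assumptions.
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

checkpoint = "dmis-lab/biobert-base-cased-v1.1"  # stand-in encoder, not BioALBERT itself
labels = ["O", "B-Disease", "I-Disease"]          # assumed BIO tag set

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# num_labels attaches a fresh token-classification head on top of the encoder.
model = AutoModelForTokenClassification.from_pretrained(
    checkpoint, num_labels=len(labels)
)

text = "The patient was diagnosed with hepatocellular carcinoma."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, num_labels)

predicted_ids = logits.argmax(dim=-1).squeeze(0)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, label_id in zip(tokens, predicted_ids):
    print(f"{token}\t{labels[int(label_id)]}")
```

The classification head here is randomly initialized, so the printed tags are meaningless until the model is fine-tuned on an annotated BioNER corpus; the sketch only illustrates the encoder-plus-task-layer architecture the quote describes.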
“…Table 1 summarises a number of datasets previously used to evaluate pre-trained LMs on various BioNLP tasks. Our previous preliminary work has shown the potential of designing a customised domain-specific LM outperforming SOTA in NER tasks [16].…”
Section: Background and Summary · Citation type: mentioning
confidence: 99%
“…One limitation of transformer-based models is the inability to capture information that is specific to a domain. To improve results in specialized domains, several transformer-based LMs, such as Biomedical BERT (BioBERT) [18], Biomedical A Lite Bidirectional Encoder Representations from Transformers (BioALBERT) [32], Twitter BERT (BERTweet) [38] and Covid-BERT (CT-BERT) [30], were trained on domain-specific corpora using the same unsupervised training method used in general models. Given how domain-specific LMs have improved performance on specific NLP tasks, we expect that domain-specific LMs should improve the vaccine sentiment classification task.…”
Section: Introduction · Citation type: mentioning
confidence: 99%
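The passage above turns on a single point: domain-specific variants like BioBERT and BioALBERT are produced by continuing the same self-supervised objective (masked language modeling) on in-domain text. As a minimal sketch of that step with Hugging Face transformers and datasets, where the base checkpoint, corpus file, and hyperparameters are assumptions for illustration:

```python
# Minimal sketch: continued masked-language-model (MLM) pretraining of a
# general LM on a domain corpus, the recipe the quoted passage describes.
# Checkpoint name, corpus path, and hyperparameters are illustrative only.
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

checkpoint = "bert-base-uncased"  # generic starting point
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)

# Assumed local corpus, one sentence per line (e.g. PubMed abstract text).
dataset = load_dataset("text", data_files={"train": "pubmed_abstracts.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

# Randomly mask 15% of tokens: the standard BERT-style MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="domain-lm", num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```

After this continued-pretraining pass, the resulting encoder would be fine-tuned on a downstream task (NER, sentiment classification) exactly as a general-domain model would be, which is the comparison the quoted introduction sets up.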