2021
DOI: 10.1145/3458754
Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing

Abstract: Pretraining large neural language models, such as BERT, has led to impressive gains on many natural language processing (NLP) tasks. However, most pretraining efforts focus on general domain corpora, such as newswire and Web. A prevailing assumption is that even domain-specific pretraining can benefit by starting from general-domain language models. In this article, we challenge this assumption by showing that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch…

Cited by 860 publications (680 citation statements)
References 50 publications
“…Using PTLMs available within the HuggingFace Transformers library, we will experiment with variations of BERT models to determine which have the best performance in article classification. These will include BERT [23], BioBERT [45], BlueBERT [46], and PubMedBERT [47]. These models differ in the pretraining text domain.…”
Section: Methods
confidence: 99%
“…PubMedBERT is a BERT model pre-trained on biomedical text from scratch by the Microsoft research team. The assumption is that pre-training the BERT model solely on biomedical text would perform better than pre-training on general-domain text (10). PubMedBERT outperformed all prior language models and obtained new SOTA results in a wide range of biomedical applications (10). We chose to use PubMedBERT as the base model for the chemical NER task.…”
Section: A. Chemical Named Entity Recognition
confidence: 99%
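As an illustration of that last point, the following is a hedged sketch of using PubMedBERT as the base encoder for chemical NER via token classification. The checkpoint identifier, BIO label set, and example sentence are assumptions, and the classification head must still be fine-tuned on a chemical NER corpus (e.g., BC5CDR) before the predictions mean anything.

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

# Assumed checkpoint name and BIO label set for chemical mentions.
MODEL_ID = "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext"
LABELS = ["O", "B-CHEMICAL", "I-CHEMICAL"]

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForTokenClassification.from_pretrained(
    MODEL_ID,
    num_labels=len(LABELS),
    id2label=dict(enumerate(LABELS)),
    label2id={label: i for i, label in enumerate(LABELS)},
)

# Tag each wordpiece of a sentence; predictions are random until the head is
# fine-tuned on a chemical NER corpus.
text = "Treatment with tamoxifen reduced tumor growth."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
pred_ids = logits.argmax(dim=-1)[0].tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
print(list(zip(tokens, [LABELS[i] for i in pred_ids])))
```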