Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.379

BioMegatron: Larger Biomedical Domain Language Model

Abstract: There has been an influx of biomedical domain-specific language models, showing that language models pre-trained on biomedical text perform better on biomedical domain benchmarks than those trained on general-domain text corpora such as Wikipedia and Books. Yet, most works do not study in depth the factors affecting each domain-language application. Additionally, the study of model size on domain-specific models has been mostly missing. We empirically study and evaluate several factors that can affect performance on …

Cited by 81 publications (60 citation statements: 1 supporting, 59 mentioning, 0 contrasting). References 16 publications.

“…Domain-specific pretraining on biomedical corpora (e.g. BIOBERT, Lee et al 2020 and BIOMEGATRON, Shin et al 2020) has made much progress in biomedical text mining tasks. Nonetheless, representing medical entities with the existing SOTA pretrained MLMs (e.g.…”
Section: PubMedBERT + SapBERT (mentioning)
confidence: 99%
“…BioMegatron 345m (Shin et al, 2020) is a large-scale model (345m parameters) by NVIDIA based on the Megatron architecture…”
Section: Biomedical Language Models (mentioning)
confidence: 99%
“…Our choices of downstream biomedical tasks are similar to (Shin et al, 2020). For Named Entity Recognition (NER) and Relation Extraction (RE), we generate our training, development, and test data using the same script that PubMedBERT uses (Gu et al, 2021).…”
Section: Downstream Tasks (mentioning)
confidence: 99%
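
For context, the downstream setup quoted above (a BERT-style biomedical encoder fine-tuned for NER and relation extraction) can be sketched in a few lines with the HuggingFace Transformers API. This is a minimal illustration, not the authors' code: the checkpoint name is an assumed stand-in (BioMegatron itself is distributed through NVIDIA NeMo/NGC rather than a canonical Hub name), and any biomedical BERT-style model on the Hub would play the same role.

# Minimal sketch: a biomedical BERT-style encoder with a token-classification
# head, as used for the NER downstream tasks cited above. The checkpoint
# name is an assumption, not the exact model evaluated in the paper.
from transformers import AutoTokenizer, AutoModelForTokenClassification

MODEL_NAME = "dmis-lab/biobert-base-cased-v1.1"  # assumed stand-in checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(
    MODEL_NAME,
    num_labels=3,  # e.g. BIO tags for one entity type: O, B-Disease, I-Disease
)

# Encode a biomedical sentence and run the (not yet fine-tuned) head.
inputs = tokenizer("Aspirin inhibits cyclooxygenase activity.",
                   return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # torch.Size([1, sequence_length, num_labels])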