Proceedings of the Sixth Workshop on Noisy User-Generated Text (W-NUT 2020), 2020
DOI: 10.18653/v1/2020.wnut-1.40
BiTeM at WNUT 2020 Shared Task-1: Named Entity Recognition over Wet Lab Protocols using an Ensemble of Contextual Language Models

Abstract: Recent improvements in machine-reading technologies have attracted much attention to automation problems and their possibilities. In this context, WNUT 2020 introduces a Named Entity Recognition (NER) task based on wet-laboratory procedures. In this paper, we present a 3-step method based on deep neural language models that reported the best overall exact-match F1-score (77.99%) of the competition. By fine-tuning 10 different pretrained language models, 10 times each, this work shows the advantage of having more models…
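As a rough illustration of the setup the abstract describes, the sketch below fine-tunes several pretrained checkpoints with several random seeds to build a pool of models for later ensembling. It relies on the Hugging Face transformers Trainer API; the checkpoint names, label count, epochs, and batch size are illustrative assumptions, not values reported in the paper.

```python
# Hypothetical sketch: fine-tune each pretrained checkpoint several times
# (different seeds) to obtain a pool of NER models to ensemble later.
# Checkpoints and hyper-parameters below are assumptions for illustration.
from transformers import (AutoModelForTokenClassification, Trainer,
                          TrainingArguments, set_seed)

CHECKPOINTS = ["bert-base-cased", "roberta-base"]  # assumed; the paper uses 10
NUM_SEEDS = 10
NUM_LABELS = 37  # assumed size of the wet-lab protocol tag set

def fine_tune(checkpoint: str, seed: int, train_ds, dev_ds):
    """Fine-tune one pretrained model on (pre-tokenized) NER data with a fixed seed."""
    set_seed(seed)
    model = AutoModelForTokenClassification.from_pretrained(
        checkpoint, num_labels=NUM_LABELS)
    args = TrainingArguments(
        output_dir=f"out/{checkpoint}-seed{seed}",
        num_train_epochs=3,              # assumed
        per_device_train_batch_size=16,  # assumed
        seed=seed,
    )
    trainer = Trainer(model=model, args=args,
                      train_dataset=train_ds, eval_dataset=dev_ds)
    trainer.train()
    return model

# pool = [fine_tune(ckpt, s, train_ds, dev_ds)
#         for ckpt in CHECKPOINTS for s in range(NUM_SEEDS)]
```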

Cited by 15 publications (14 citation statements) · References 11 publications
“…Particularly, the ensemble of masked language models brought the highest performance gain to the search pipeline. Indeed, ensembles of language models have proved to be a robust methodology to improve predictive performance [53-55].…”
Section: Discussion
Confidence: 99%
“…Second, identifying textual evidence in publications to re-rank publications might have a positive impact on the literature triage [25, 26]. Finally, pre-trained language and ensemble learning models [27] could be opportunely used to provide the curator with a more focused evidence passage to support the curation work of mutation databases [18]. To conclude, the system we developed has the potential to significantly propel variant curation. It is, however, to be noted that such a system is neither intended to replace human curators nor clinical expertise, but rather to support these professionals by cutting down the cost of the manual triage of the literature.…”
Section: Discussion
Confidence: 99%
“…Conversely, for the clinical NER, for which a token could be assigned to more than one entity, we used a sigmoid function to provide a multi-label classifier. More information about the fine-tuning of the models and the hyper-parameter settings can be found in [Copara et al., 2020a,b; Knafou et al., 2020].…”
Section: Methods
Confidence: 99%
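A minimal PyTorch sketch of the idea quoted above: a token-classification head with sigmoid outputs, so each token can carry several entity labels at once. The hidden size, label count, and 0.5 decision threshold are assumptions for illustration, not values taken from the cited papers.

```python
# Sketch of a multi-label token-classification head with sigmoid outputs:
# each label gets an independent probability, so a token may receive
# more than one entity label. Dimensions below are assumed.
import torch
import torch.nn as nn

class MultiLabelTokenHead(nn.Module):
    def __init__(self, hidden_size: int = 768, num_labels: int = 10):
        super().__init__()
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, encoder_states: torch.Tensor) -> torch.Tensor:
        # encoder_states: (batch, seq_len, hidden_size) from the language model
        logits = self.classifier(encoder_states)
        return torch.sigmoid(logits)  # independent probability per label

head = MultiLabelTokenHead()
states = torch.randn(2, 16, 768)  # dummy encoder output
probs = head(states)              # (2, 16, 10)
predicted = probs > 0.5           # a token may get more than one entity label
```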
“…Our ensemble method is based on a voting strategy, where each model votes with its predictions and a simple majority of votes is necessary to assign the predictions [Copara et al., 2020a,b; Knafou et al., 2020]. In other words, for a given document, our models infer their predictions independently for each entity.…”
Section: Methods
Confidence: 99%
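A hedged sketch of this majority-vote ensembling: each model predicts a label per token, and a label is kept when a strict majority of models agrees. The helper name, the example labels, and the fallback to the "O" (outside) tag are illustrative assumptions, not the authors' code.

```python
# Majority voting over per-token label sequences from several NER models.
from collections import Counter
from typing import List

def majority_vote(per_model_labels: List[List[str]]) -> List[str]:
    """Combine per-token label sequences from several models.

    per_model_labels: one label sequence per model, all the same length.
    Returns the label chosen by the most models at each position, falling
    back to "O" (outside) when no label reaches a strict majority.
    """
    n_models = len(per_model_labels)
    ensembled = []
    for position_labels in zip(*per_model_labels):
        label, count = Counter(position_labels).most_common(1)[0]
        ensembled.append(label if count > n_models / 2 else "O")
    return ensembled

# Example: three models vote on a four-token sentence.
votes = [
    ["B-Action", "O", "B-Reagent", "O"],
    ["B-Action", "O", "O",         "O"],
    ["B-Action", "O", "B-Reagent", "I-Reagent"],
]
print(majority_vote(votes))  # ['B-Action', 'O', 'B-Reagent', 'O']
```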