Proceedings of the Sixth Social Media Mining for Health (#SMM4H) Workshop and Shared Task 2021
DOI: 10.18653/v1/2021.smm4h-1.5

BERT based Transformers lead the way in Extraction of Health Information from Social Media

Abstract: This paper describes our submissions for the Social Media Mining for Health (SMM4H) 2021 shared tasks. We participated in 2 tasks: (1) Classification, extraction and normalization of adverse drug effect (ADE) mentions in English tweets (Task-1) and (2) Classification of COVID-19 tweets containing symptoms (Task-6). Our approach for the first task uses the language representation model RoBERTa with a binary classification head. For the second task, we use BERTweet, based on RoBERTa. Fine-tuning is performed on …
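The setup the abstract describes maps directly onto the Hugging Face transformers API. The sketch below is illustrative only, not the authors' code; the checkpoint names and the example tweet are assumptions.

```python
# Minimal sketch (not the authors' code): RoBERTa with a binary
# classification head, as described in the abstract for Task-1.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2  # binary head: ADE vs. no-ADE
)

# For Task-6 the paper uses BERTweet, a RoBERTa model pre-trained on tweets;
# swapping the checkpoint is the only change needed in this sketch:
# tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base")

batch = tokenizer(
    ["I got a terrible headache after taking this drug."],  # assumed example
    padding=True, truncation=True, return_tensors="pt",
)
with torch.no_grad():
    logits = model(**batch).logits
pred = logits.argmax(dim=-1)  # 0 or 1; label meaning depends on training data
```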

Cited by 6 publications (5 citation statements) · References 8 publications
“…The parser pipeline achieved an accuracy of 95.1% on the OntoNotes 5.0 corpus. The en_core_web_trf dependency parsing pipeline has found application in various research studies for the initial reduction of irrelevant information [47][48][49]. This utilisation helped to reduce the model's processing time when dealing with extensive datasets while concurrently improving overall accuracy.…”
Section: Dependency Parsing (mentioning)
confidence: 99%
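A minimal spaCy sketch of the usage pattern this statement describes follows. The filtering rule (keeping only a few dependency labels) is a hypothetical illustration of "reduction of irrelevant information", not the cited studies' actual criterion.

```python
# Hedged sketch: parse with spaCy's en_core_web_trf pipeline and keep only
# tokens whose dependency labels suggest they carry the core information.
# Requires: pip install spacy && python -m spacy download en_core_web_trf
import spacy

nlp = spacy.load("en_core_web_trf")

doc = nlp("The patient, who was admitted yesterday, developed a severe rash.")
# Illustrative filter: subject, main verb, object, and their modifiers.
kept = [tok.text for tok in doc if tok.dep_ in {"nsubj", "ROOT", "dobj", "amod"}]
print(kept)  # e.g. ['patient', 'developed', 'severe', 'rash']
```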
“…Table 5 shows the performance of uni-modal and bi-modal models on English tweets. Since the official evaluation imposed a strict limit of 3 submissions per day, our official SMM4H 2021 submission combined the results of ten models with the same settings using a simple voting scheme; this achieved results on par with the best-performing team [43], which utilized the RoBERTa model with undersampling and oversampling.…”
Section: ADE Text Classification (mentioning)
confidence: 99%
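The "simple voting scheme" over ten models can be sketched in a few lines. The array shapes and example values below are assumptions, not the citing paper's implementation.

```python
# Illustrative majority-vote ensemble: ten independently trained binary
# classifiers vote, and the label chosen by most models wins.
import numpy as np

def majority_vote(predictions: np.ndarray) -> np.ndarray:
    """predictions: (n_models, n_examples) array of 0/1 labels."""
    votes = predictions.sum(axis=0)                    # models voting 1
    return (votes > predictions.shape[0] / 2).astype(int)

# Ten models, four examples: a 1 survives only where most models agree.
preds = np.array([[1, 0, 1, 0]] * 7 + [[0, 0, 1, 1]] * 3)
print(majority_vote(preds))  # [1 0 1 0]
```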
“…Text formatting was initially done to extract the therapeutics from physician notes using Named Entity Recognition (NER), details of which are provided in Supplementary Figure 1 [21]. The encrypted information ([**word**]), along with single-letter words (e.g., F, M) present in the MIMIC data, was removed from the text. Text was converted to lowercase, followed by standard pre-processing such as removal of whitespace, stopwords, punctuation, digits, and words with two or fewer letters.…”
Section: ShockModes Pipeline (mentioning)
confidence: 99%
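The preprocessing steps this statement lists can be sketched briefly. The regexes, the NLTK stopword list, and the example note below are assumptions, not the ShockModes code.

```python
# Minimal sketch of the quoted pipeline: strip [**word**]-style encrypted
# spans, lowercase, then drop stopwords, punctuation, digits, and words of
# two or fewer letters. Requires: nltk.download("stopwords")
import re
from nltk.corpus import stopwords

STOPWORDS = set(stopwords.words("english"))

def preprocess(text: str) -> str:
    text = re.sub(r"\[\*\*.*?\*\*\]", " ", text)     # encrypted [**...**] spans
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)            # punctuation and digits
    tokens = [t for t in text.split()
              if t not in STOPWORDS and len(t) > 2]  # stopwords, short words
    return " ".join(tokens)

print(preprocess("Pt is a 72 y/o M, admitted on [**2101-5-12**] with sepsis."))
# -> "admitted sepsis"
```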