Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021) 2021
DOI: 10.18653/v1/2021.semeval-1.85

JUST-BLUE at SemEval-2021 Task 1: Predicting Lexical Complexity using BERT and RoBERTa Pre-trained Language Models

Abstract: Predicting the complexity level of a word or a phrase is considered a challenging task. It is even recognized as a crucial step in numerous NLP applications, such as text rearrangement and text simplification. Early research treated the task as binary classification, where systems predicted whether or not a word is complex (complex versus simple). Other studies were designed to assess the level of word complexity using regression models or multi-label classification models. Deep…

Cited by 9 publications (15 citation statements) | References 12 publications
“…We evaluate the results of our system by applying regression metrics to the outputs of the supervised learning algorithms, specifically MAE, MSE, RMSE, and R2. We emphasize that we apply the methodologies of the winning teams, which are based on pre-trained, fine-tuned Transformer language models (BERT and RoBERTa) [9,34,35], together with linguistic, syntactic, and statistical features, and with word- and sentence-level embeddings.…”
Section: Results
confidence: 99%
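As a minimal illustration of the regression metrics named in the statement above (MAE, MSE, RMSE, R2), here is a sketch using scikit-learn and NumPy; the gold/predicted score arrays are illustrative placeholders, not the cited system's outputs.

```python
# Illustrative only: compare hypothetical gold scores against hypothetical predictions.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

gold = np.array([0.25, 0.40, 0.10, 0.75, 0.55])  # assumed gold complexity scores in [0, 1]
pred = np.array([0.30, 0.35, 0.15, 0.70, 0.60])  # assumed model predictions

mae = mean_absolute_error(gold, pred)
mse = mean_squared_error(gold, pred)
rmse = np.sqrt(mse)                              # RMSE derived from MSE
r2 = r2_score(gold, pred)

print(f"MAE={mae:.4f}  MSE={mse:.4f}  RMSE={rmse:.4f}  R2={r2:.4f}")
```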
“…Deep learning models have improved significantly over "shallow" machine learning models with the advent of transfer learning and pre-trained language models. The BERT and XLM-RoBERTa pre-trained deep learning language models are considered to be at the forefront of many NLP tasks [9].…”
Section: Introduction
confidence: 99%
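For readers unfamiliar with how such pre-trained models are adapted to a scoring task like lexical complexity prediction, below is a hedged sketch using the Hugging Face transformers library; the checkpoint name (roberta-base), the single-output regression head, and the example sentence/target pair are assumptions for illustration, not the cited teams' exact configurations.

```python
# Hedged sketch, not the authors' published code: a single-output regression head on top
# of a pre-trained encoder, a standard way to score a target word in context.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "roberta-base"  # assumption; "bert-base-uncased" is the analogous BERT choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=1, problem_type="regression"
)

# LCP-style input: the sentence and the target word passed as a text pair (assumed format).
context = "The amendment was ratified by the council."
target = "ratified"
inputs = tokenizer(context, target, return_tensors="pt", truncation=True)

with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()  # meaningless until fine-tuned
print(score)
```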
“…Just Blue by Yaseen et al. [149] achieved the highest Pearson's Correlation of 0.7886 at LCP-2021's sub-task 1 [118]. It was inspired by the prior state-of-the-art performance of ensemble-based models, together with the recent headway made by transformers in various NLP-related tasks [149].…”
Section: 3.1
confidence: 99%
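Pearson's Correlation, the ranking metric mentioned above, can be computed with SciPy as in the sketch below; the score arrays are illustrative and do not reproduce the reported 0.7886 result.

```python
# Illustrative arrays only; the real evaluation compares system scores against the
# shared task's gold annotations.
from scipy.stats import pearsonr

gold = [0.20, 0.45, 0.10, 0.80, 0.55, 0.35]
pred = [0.25, 0.40, 0.18, 0.72, 0.60, 0.30]

r, p_value = pearsonr(gold, pred)
print(f"Pearson r = {r:.4f} (p = {p_value:.3g})")
```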
“…Pan et al. [99] attributed their model's good performance in both sub-tasks to its use of multiple transformers and training strategies. With model diversity also being an influential factor in Just Blue's high performance [149], it would appear that current state-of-the-art LCP systems consist of an ensemble of differing transformer-based models.…”
Section: 3.1
confidence: 99%
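The ensembling idea described here, combining predictions from differing transformer-based models, reduces in its simplest form to averaging per-model scores. The sketch below shows that with placeholder prediction arrays; bert_preds and roberta_preds are hypothetical names, not the cited systems' variables.

```python
# Hypothetical per-model prediction arrays; real systems would produce these by running
# each fine-tuned model over the test set.
import numpy as np

bert_preds = np.array([0.31, 0.44, 0.12, 0.73])     # assumed BERT-based model outputs
roberta_preds = np.array([0.28, 0.41, 0.16, 0.69])  # assumed RoBERTa-based model outputs

# Unweighted mean of the two models; weighted averaging is a common variant when one
# model validates better than the other.
ensemble_preds = np.mean([bert_preds, roberta_preds], axis=0)
print(ensemble_preds)
```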
“…As with this team, the use of contextual embedding models stood out as a fundamental part of the presented systems. Such is the case of the JUST BLUE team [138], which leverages context information extracted from BERT and RoBERTa models, achieving the highest Pearson's Correlation score in the first task. Similarly, the RG_PA team [139] ensembles RoBERTa models in its classification, obtaining the second-highest Pearson's Correlation score in the second task.…”
Section: Substitute Ranking (SR)
confidence: 99%
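A brief sketch of what "context information extracted from BERT and RoBERTa models" can look like in practice: pulling contextual token representations from a pre-trained encoder with Hugging Face transformers. The checkpoint choice and mean pooling are assumptions for illustration, not the JUST BLUE team's exact pipeline.

```python
# Assumed checkpoint and pooling; shown only to make "contextual embeddings" concrete.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentence = "The committee deferred the decision until next month."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # shape: (1, seq_len, 768)

# Mean-pool over tokens for a sentence-level vector; a target word's vector could
# instead be taken from its own token positions.
sentence_vector = hidden.mean(dim=1).squeeze(0)
print(sentence_vector.shape)  # torch.Size([768])
```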