11th Hellenic Conference on Artificial Intelligence 2020
DOI: 10.1145/3411408.3411440

GREEK-BERT: The Greeks visiting Sesame Street

Abstract: Transformer-based language models, such as BERT and its variants, have achieved state-of-the-art performance in several downstream natural language processing (NLP) tasks on generic benchmark datasets (e.g., GLUE, SQuAD, RACE). However, these models have mostly been applied to the resource-rich English language. In this paper, we present GREEK-BERT, a monolingual BERT-based language model for modern Greek. We evaluate its performance in three NLP tasks, i.e., part-of-speech tagging, named entity recognition, a…
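To make the abstract concrete, below is a minimal sketch of querying the released GREEK-BERT checkpoint as a masked language model with the Hugging Face transformers library. The hub id nlpaueb/bert-base-greek-uncased-v1 is assumed from the authors' public release; the input is pre-lowercased and accent-free because the model is uncased.

from transformers import pipeline

# Masked-word prediction with GREEK-BERT (hub id assumed, not stated in this page).
fill_mask = pipeline("fill-mask", model="nlpaueb/bert-base-greek-uncased-v1")

# "Today the weather is [MASK]." -- the model proposes Greek fillers for the mask.
for prediction in fill_mask("σημερα ο καιρος ειναι [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))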


Cited by 59 publications (31 citation statements); references 25 publications. Selected citation statements:
“…We use the pre-trained cased and uncased multilingual BERT model (Devlin et al., 2019) and report results of the best variant for each language. We also report results obtained with four monolingual models: bert-base-uncased (Devlin et al., 2019), flaubert_base_uncased (Le et al., 2020), bert-base-spanish-wwm-uncased (Cañete et al., 2020), and bert-base-greek-uncased-v1 (Koutsikakis et al., 2020). We compare to results obtained using fastText static embeddings in each language (Grave et al., 2018).…”
Section: Methodology (Models)
Citation type: mentioning (confidence: 98%)
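For illustration, the sketch below loads the checkpoints named in this quote. The Hugging Face hub namespaces are not given in the quote and are filled in here as plausible assumptions.

from transformers import AutoModel, AutoTokenizer

# Assumed hub identifiers for the monolingual and multilingual checkpoints.
checkpoints = {
    "multilingual": "bert-base-multilingual-uncased",
    "english": "bert-base-uncased",
    "french": "flaubert/flaubert_base_uncased",
    "spanish": "dccuchile/bert-base-spanish-wwm-uncased",
    "greek": "nlpaueb/bert-base-greek-uncased-v1",
}

models = {}
for language, name in checkpoints.items():
    # Each (tokenizer, encoder) pair can then back a task-specific head.
    models[language] = (AutoTokenizer.from_pretrained(name),
                       AutoModel.from_pretrained(name))

# The fastText baselines (Grave et al., 2018) are static vectors and would be
# loaded separately, e.g. from the published cc.<lang>.300 files.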
“…Very recently, researchers employed transformer-based language models, such as BERT, and achieved state-of-the-art performance in several NLP tasks. The GreekBERT model [54] has been pre-trained on 29 GB of text from the Greek parts of (i) Wikipedia, (ii) the European Parliament Proceedings Parallel Corpus (Europarl), and (iii) OSCAR [55], a clean version of Common Crawl [56]. Finally, a GPT-2 model for Greek, trained on a large (almost 5 GB) text corpus from Wikipedia, is also available online [57].…”
Section: Greek Embeddings
Citation type: mentioning (confidence: 99%)
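As a small illustration of what the released GreekBERT checkpoint exposes to users, the sketch below (hub id assumed as above) loads its tokenizer and prints the vocabulary size and the WordPiece segmentation of a Europarl-style Greek sentence; this is an example, not part of the cited work.

from transformers import AutoTokenizer

# Load the GreekBERT tokenizer (hub id assumed) and inspect how it segments
# "The European Parliament convenes in Strasbourg."
tokenizer = AutoTokenizer.from_pretrained("nlpaueb/bert-base-greek-uncased-v1")

text = "Το Ευρωπαϊκό Κοινοβούλιο συνεδριάζει στο Στρασβούργο."
print(len(tokenizer))              # size of the Greek WordPiece vocabulary
print(tokenizer.tokenize(text))    # lowercased subword tokens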
“…In addition to PaloBERT, we decided to further train an existing language model on the collected dataset. More specifically, we selected GreekBERT [54] (Section 2.3) and trained it for an additional 10 epochs (Figure 6) on our social media corpus, on an AWS p2.xlarge instance running the Deep Learning AMI (Amazon Linux) Version 48.0. The resulting model, henceforth referred to as GreekSocialBERT, is also available for download at HuggingFace's model repository [73].…”
Section: Datasets, Algorithms and Models Employed
Citation type: mentioning (confidence: 99%)
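The quote describes continued masked-language-model pre-training of GreekBERT on a social-media corpus. The sketch below shows what such continued pre-training looks like with the transformers Trainer; the corpus file name, batch size, and other hyperparameters are placeholders, not the cited authors' exact setup.

from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "nlpaueb/bert-base-greek-uncased-v1"   # assumed hub id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Hypothetical plain-text corpus file; the cited social-media corpus is not public here.
dataset = load_dataset("text", data_files={"train": "greek_social_media.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="greek-social-bert",
    num_train_epochs=10,             # the 10 additional epochs mentioned above
    per_device_train_batch_size=16,  # placeholder value
)

Trainer(model=model, args=args, train_dataset=tokenized,
        data_collator=collator).train()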
“…We use the context representations from a 600-d context2vec model pre-trained on the ukWaC corpus (Baroni et al., 2009). For French, Spanish, and Greek, we use BERT models specifically trained for each language: FlauBERT (flaubert_base_uncased), BETO (bert-base-spanish-wwm-uncased) (Cañete et al., 2020), and Greek BERT (bert-base-greek-uncased-v1) (Koutsikakis et al., 2020). We also use the bert-base-multilingual-cased model for each of the four languages.…”
Section: Contextualized Word Representations
Citation type: mentioning (confidence: 99%)
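This quote uses language-specific BERT models to obtain contextualized representations of a target word. Below is a minimal sketch, assuming the Greek hub id used above, that takes the last-layer hidden states and averages the subword vectors of one word into a single contextual vector; the sentence and target word are illustrative only.

import torch
from transformers import AutoModel, AutoTokenizer

name = "nlpaueb/bert-base-greek-uncased-v1"   # assumed hub id
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

sentence = "Η τράπεζα έκλεισε νωρίς σήμερα."   # "The bank closed early today."
encoded = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    hidden = model(**encoded).last_hidden_state   # shape: (1, seq_len, 768)

# Average the subword vectors of the target word "τράπεζα" (word index 1;
# special tokens map to None) to get one contextual representation.
word_ids = encoded.word_ids(0)
target_positions = [i for i, w in enumerate(word_ids) if w == 1]
context_vector = hidden[0, target_positions].mean(dim=0)
print(context_vector.shape)   # torch.Size([768])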