2020
DOI: 10.48550/arxiv.2005.12515
Preprint
ParsBERT: Transformer-based Model for Persian Language Understanding

Mehrdad Farahani, Mohammad Gharachorloo, Marzieh Farahani, et al.

Abstract: The surge of pre-trained language models has begun a new era in the field of Natural Language Processing (NLP) by allowing us to build powerful language models. Among these models, Transformer-based models such as BERT have become increasingly popular due to their state-of-the-art performance. However, these models are usually focused on English, leaving other languages to multilingual models with limited resources. This paper proposes a monolingual BERT for the Persian language (ParsBERT), which shows its sta…

Cited by 6 publications (8 citation statements). References 17 publications.

Citation statements:
“…This model achieved an F1-score over 83%. Farahani et al [21] investigated BERT's performance in NLP for Persian due to its success and growing popularity in English. By training the BERT model for Persian language, they showed that it outperformed previous studies at operations such as text classification and sentiment analysis.…”
Section: BERT-Based Techniques (mentioning, confidence: 99%)
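The fine-tuning workflow referenced above (ParsBERT applied to Persian text classification and sentiment analysis) is not spelled out in the citation statement. The following is a minimal sketch of how such a setup is commonly assembled with the Hugging Face transformers library; the checkpoint id, label count, and example sentence are assumptions for illustration, not details from the paper.

```python
# Minimal sketch (assumed details): loading a ParsBERT checkpoint for Persian
# sentiment classification with Hugging Face transformers. The checkpoint id
# below is an assumption; verify the exact name on the model hub.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "HooshvareLab/bert-base-parsbert-uncased"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Tokenize a Persian sentence and run one forward pass. The classification
# head is randomly initialized, so the logits are meaningful only after
# fine-tuning on a labeled sentiment dataset.
inputs = tokenizer("این فیلم عالی بود", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))
```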
“…ParsBERT [7] is a monolingual BERT model trained by Hooshvare Lab. It is trained on monolingual corpus of Persian text and has achieved state-of-the-art results by outperforming previous mBERT-based and recurrent neural network-based models in several benchmarks.…”
Section: ParsBERT (mentioning, confidence: 99%)
“…In addition, we train our models using the ParsBERT model [7], which is a monolingual BERT model for the Persian language that has outperformed other architectures and multilingual BERT models. We also include earlier sequence taggers, for example, bidirectional long short-term memory (BiLSTM) [8] and CRF to our pretrained models to provide a comprehensive comparative study of the impacts of text representation and learning models in NER.…”
(mentioning, confidence: 99%)
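The NER setup described above (ParsBERT as the pretrained encoder, compared against BiLSTM and CRF sequence taggers) can be approximated with a token-classification head. The sketch below assumes a generic IOB tag set and the same hypothetical checkpoint id; neither comes from the cited work.

```python
# Hedged sketch: ParsBERT as a token-classification (NER) backbone.
# The tag set and checkpoint id are illustrative assumptions.
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

labels = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC", "B-ORG", "I-ORG"]  # assumed IOB tags
model_name = "HooshvareLab/bert-base-parsbert-uncased"  # assumed checkpoint id

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name, num_labels=len(labels))

# One forward pass over a Persian sentence; argmax over the label dimension
# gives one tag id per subword token (meaningful only after fine-tuning).
encoded = tokenizer("مهرداد در تهران زندگی می‌کند", return_tensors="pt")
with torch.no_grad():
    logits = model(**encoded).logits        # shape: (1, seq_len, num_labels)
predicted_tag_ids = logits.argmax(dim=-1)
```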
“…Multilingual BERT (mBERT) is pretrained on the masked LM task over 104 languages. Additionally, we use two specialized variants of BERT for Persian: wikiBERT, which is trained on Persian Wikipedia, and ParsBERT (Farahani et al, 2020). Finally, we use mT5 (Xue et al, 2020), which is a multilingual variant of the T5 architecture.…”
Section: Machine Translation (mentioning, confidence: 99%)
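The masked-LM pretraining objective mentioned for mBERT can be probed directly with a fill-mask pipeline. The sketch below uses the public bert-base-multilingual-cased checkpoint and an arbitrary Persian example sentence, neither taken from the cited work.

```python
# Sketch of the masked-LM objective: mBERT predicts the token hidden behind
# [MASK]. The model id is the standard public multilingual BERT checkpoint.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-multilingual-cased")

# Persian sentence with one masked token; the pipeline returns ranked fills.
for candidate in fill("پایتخت ایران [MASK] است."):
    print(candidate["token_str"], round(candidate["score"], 3))
```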