2020
DOI: 10.48550/arxiv.2007.01658
Preprint

Playing with Words at the National Library of Sweden -- Making a Swedish BERT

Abstract: This paper introduces the Swedish BERT ("KB-BERT") developed by the KBLab for data-driven research at the National Library of Sweden (KB). Building on recent efforts to create transformer-based BERT models for languages other than English, we explain how we used KB's collections to create and train a new language-specific BERT model for Swedish. We also present the results of our model in comparison with existing models, chiefly that produced by the Swedish Public Employment Service, Arbetsförmedlingen, and Goo…

Cited by 18 publications (18 citation statements)
References 8 publications
“…In addition, word-embedding-based and transformer-based models are available in a multitude of languages (for Spanish, see Canete et al 2020; for English, see Clark et al 2020; for Swedish, see Malmsten, Börjeson, and Haffenden 2020; and for French, see Martin et al 2019). Even though this article deals with the classification of discrete emotional language in German, it can serve as a framework to create similar tools for other languages which potentially achieve even better performances.…”
Section: Discussion
confidence: 99%
“…BERT's architecture is a multi-layer Transformer encoder that is based on the original Transformer architecture introduced by Vaswani et al (2017). We use cased BERT models (TensorFlow versions) through the Huggingface Transformers library (Wolf et al, 2020) with the following language-specific models: the original English BERT, Finnish FinBERT (Virtanen et al, 2019), French FlauBERT (Le et al, 2020) and Swedish KB-BERT (Malmsten et al, 2020). Additionally, we use Multilingual BERT (mBERT) (Devlin et al, 2019), which was pretrained on monolingual Wikipedia corpora from 104 languages with a shared multilingual vocabulary.…”
Section: Methods
confidence: 99%
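The excerpt above describes loading cased, language-specific BERT models through the Huggingface Transformers library. A minimal sketch of that setup is given below; the Hub model identifiers (e.g. "KB/bert-base-swedish-cased") are assumptions rather than names taken from the cited papers, and the sketch uses the PyTorch classes for brevity even though the citing study reports using the TensorFlow versions.

```python
# Sketch: loading the cased, language-specific BERT models named in the excerpt
# via the Huggingface Transformers library. Model identifiers are assumed Hub
# names; the citing paper used the TensorFlow classes (TFAutoModel) instead.
from transformers import AutoTokenizer, AutoModel

MODEL_IDS = {
    "english": "bert-base-cased",                       # original English BERT
    "finnish": "TurkuNLP/bert-base-finnish-cased-v1",   # FinBERT (assumed Hub id)
    "french": "flaubert/flaubert_base_cased",           # FlauBERT (assumed Hub id)
    "swedish": "KB/bert-base-swedish-cased",            # KB-BERT (assumed Hub id)
    "multilingual": "bert-base-multilingual-cased",     # mBERT, 104-language shared vocabulary
}

def load_language_model(language: str):
    """Return (tokenizer, model) for the requested language-specific BERT."""
    model_id = MODEL_IDS[language]
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    return tokenizer, model

if __name__ == "__main__":
    tokenizer, model = load_language_model("swedish")
    inputs = tokenizer("Kungliga biblioteket ligger i Stockholm.", return_tensors="pt")
    outputs = model(**inputs)
    print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768) for a base-size model
```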
“…The keyword-based labelling system produces annotation labels as outputs, which can be used as supervision signals. To widen the experimental scope, the base Swedish BERT model, KB-BERT (Malmsten et al, 2020), was used on a token level. Thus, an annotation consisting of five tokens resulted in a sequence of five 768-dimensional embeddings.…”
Section: Results
confidence: 99%
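As an illustration of the token-level usage described in this excerpt, the sketch below extracts per-token hidden states from KB-BERT so that a five-token annotation yields five 768-dimensional vectors. The Hub identifier "KB/bert-base-swedish-cased", the example annotation, and the mean-pooling of word pieces back to whole tokens are assumptions; the excerpt only states that five tokens map to five 768-dimensional embeddings.

```python
# Sketch: per-token 768-d embeddings from KB-BERT for a five-token annotation.
# The model id and the mean-pooling of word pieces are assumptions, not details
# taken from the cited paper.
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_ID = "KB/bert-base-swedish-cased"  # assumed Hub id for KB-BERT
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

annotation = ["patienten", "har", "ont", "i", "huvudet"]  # hypothetical five-token annotation

# Tokenize pre-split words so word pieces can be aligned back to annotation tokens.
encoded = tokenizer(annotation, is_split_into_words=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**encoded).last_hidden_state[0]  # (num_word_pieces, 768)

# Mean-pool the word pieces belonging to each annotation token.
word_ids = encoded.word_ids(batch_index=0)
token_vectors = []
for idx in range(len(annotation)):
    piece_positions = [i for i, w in enumerate(word_ids) if w == idx]
    token_vectors.append(hidden[piece_positions].mean(dim=0))

embeddings = torch.stack(token_vectors)
print(embeddings.shape)  # torch.Size([5, 768]) -- five tokens, 768 dimensions each
```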