2020
DOI: 10.3233/faia200610

LVBERT: Transformer-Based Model for Latvian Language Understanding

Abstract: This paper presents LVBERT – the first publicly available monolingual language model pre-trained for Latvian. We show that LVBERT improves the state-of-the-art for three Latvian NLP tasks including Part-of-Speech tagging, Named Entity Recognition and Universal Dependency parsing. We release LVBERT to facilitate future research and downstream applications for Latvian NLP.

Cited by 11 publications (9 citation statements)
References 4 publications
“…The endings of borrowed names, which become formants of new names, confirm the openness of the Lithuanian name stock to innovation. which was replaced with a domain-adapted named entity recognizer (Znotiņš & Bārzdiņš, 2020), thereby improving named entity recognition accuracy. To ensure successful information retrieval from the knowledge base, the entities identified in the user's utterance are converted to their base form, and the dialogue system also includes a component for recognizing and normalizing synonymous uses of entities.…”
Section: What Human Genetics Can Give Linguistics a Case Of Region Around Baltics - Germanic Baltic And Finno-ugric Languages
Citation type: unclassified
“…The neural network model is based on Latvian BERT word embeddings (Znotins and Barzdins, 2020). To support entity classes of a particular domain, the NER model is trained on a larger general-domain dataset (Gruzitis et al, 2018; Paikens et al, 2020) and a smaller domain-specific dataset.…”
Section: Entity Database
Citation type: mentioning
confidence: 99%
“…The tool has been successfully used for the creation of a dialogue dataset for the virtual assistant that supports the work of the student service in relation to three frequently asked topics: working hours and contacts of the personnel and structural units (e.g. libraries), issues regarding academic leave, as well as enrollment requirements and documents to be submitted (Skadina and Gosko, 2020).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…The LVBERT model [5] yields slightly worse results for POS tagging (98.1 % accuracy) and NER (82.6 % accuracy) tasks than the best embeddings in [4]. Likewise, the Latvian BERT model in [6] applied to NER task achieved an F1 score of 81.91 % while using a significantly smaller training corpus than in [4].…”
Section: Related Work
Citation type: mentioning
confidence: 99%