2018
DOI: 10.1162/tacl_a_00032

Language Modeling for Morphologically Rich Languages: Character-Aware Modeling for Word-Level Prediction

Abstract: Neural architectures are prominent in the construction of language models (LMs). However, word-level prediction is typically agnostic of subword-level information (characters and character sequences) and operates over a closed vocabulary, consisting of a limited word set. Indeed, while subword-aware models boost performance across a variety of NLP tasks, previous work did not evaluate the ability of these models to assist next-word prediction in language modeling tasks. Such subword-level informed models shoul…
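The paper itself is not reproduced on this page, but the abstract's central idea — building word representations from characters while still predicting the next word over a word-level vocabulary — can be illustrated with a small sketch. The architecture, layer sizes, and names below (CharAwareWordLM, a character BiLSTM encoder feeding a word-level LSTM) are illustrative assumptions written in PyTorch, not the exact model of Gerz et al. (2018).

```python
# A minimal sketch (not the paper's exact architecture) of character-aware
# word-level prediction: each word is encoded from its characters with a
# character-level BiLSTM, and the resulting word vectors feed a word-level
# LSTM LM whose softmax still ranges over a closed word vocabulary.
# All hyperparameters and names below are illustrative assumptions.
import torch
import torch.nn as nn

class CharAwareWordLM(nn.Module):
    def __init__(self, char_vocab, word_vocab, char_dim=32, word_dim=128, hidden=256):
        super().__init__()
        self.char_emb = nn.Embedding(char_vocab, char_dim, padding_idx=0)
        # Character encoder: BiLSTM over the characters of a single word.
        self.char_rnn = nn.LSTM(char_dim, word_dim // 2, batch_first=True,
                                bidirectional=True)
        # Word-level LM over the sequence of character-derived word vectors.
        self.word_rnn = nn.LSTM(word_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, word_vocab)  # closed-vocabulary softmax

    def encode_words(self, char_ids):
        # char_ids: (batch, seq_len, max_word_len) character indices per word.
        b, t, c = char_ids.shape
        emb = self.char_emb(char_ids.view(b * t, c))    # (b*t, c, char_dim)
        _, (h, _) = self.char_rnn(emb)                  # h: (2, b*t, word_dim//2)
        word_vecs = torch.cat([h[0], h[1]], dim=-1)     # (b*t, word_dim)
        return word_vecs.view(b, t, -1)

    def forward(self, char_ids):
        word_vecs = self.encode_words(char_ids)
        states, _ = self.word_rnn(word_vecs)            # (b, t, hidden)
        return self.out(states)                         # (b, t, word_vocab)

# Example: batch of 2 sentences, 5 words each, words padded to 8 characters.
model = CharAwareWordLM(char_vocab=100, word_vocab=10000)
chars = torch.randint(1, 100, (2, 5, 8))
logits = model(chars)   # next-word logits, shape (2, 5, 10000)
```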

Cited by 40 publications (57 citation statements)
References 26 publications (42 reference statements)
“…Cotterell et al. (2018) study 21 languages. Gerz et al. (2018) create datasets for 50 languages. All of these studies, however, only create small datasets, which are inadequate for pretraining language models.…”
Section: Cross-lingual Pretrained Language Models (mentioning)
confidence: 99%
“…The first direction aims to obtain good embeddings for novel words by looking at their characters (Pinter, Guthrie, and Eisenstein 2017), morphemes (Lazaridou et al. 2013; Luong, Socher, and Manning 2013; Cotterell, Schütze, and Eisner 2016) or n-grams (Wieting et al. 2016; Bojanowski et al. 2017; Ataman and Federico 2018; Salle and Villavicencio 2018). Naturally, this direction is especially well-suited for languages with rich morphology (Gerz et al. 2018). The second, context-based direction tries to infer embeddings for novel words from the words surrounding them (Lazaridou, Marelli, and Baroni 2017; Herbelot and Baroni 2017; Khodak et al. 2018).…”
Section: Introduction (mentioning)
confidence: 99%
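As an illustration of the first, subword-based direction described in the excerpt above, the sketch below composes a vector for a possibly unseen word from its character n-grams, roughly in the spirit of fastText (Bojanowski et al. 2017). The bucket count, n-gram range, dimensionality, and the use of Python's built-in hash are illustrative assumptions; in a trained model the n-gram vectors would be learned jointly with the word vectors rather than randomly initialised.

```python
# A minimal sketch of composing an embedding for a novel word from its
# character n-grams. The hashing scheme, n-gram range, and dimensionality
# are illustrative assumptions, not the exact setup of any cited paper.
import numpy as np

N_BUCKETS, DIM = 2 ** 20, 100
rng = np.random.default_rng(0)
# Stand-in for trained n-gram vectors; in practice these are learned
# during training rather than sampled at random.
ngram_table = rng.normal(scale=0.1, size=(N_BUCKETS, DIM))

def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word wrapped in boundary markers < and >."""
    w = f"<{word}>"
    return [w[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

def embed(word):
    """Embedding of a (possibly unseen) word as the mean of its n-gram vectors."""
    grams = char_ngrams(word)
    rows = [hash(g) % N_BUCKETS for g in grams]   # hashed bucket per n-gram
    return ngram_table[rows].mean(axis=0)

# A morphologically complex, possibly out-of-vocabulary word still gets a vector.
vec = embed("uncharacteristically")
print(vec.shape)  # (100,)
```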
“…All benchmarked n-gram LMs are 5-grams, with the exception of BKN, which is an ∞-gram model trained via 5 samples following the recipe of . GKN (Gerz et al., 2018a). Suit symbols denote morphological types: ♢ Isolating, ♡ Fusional, ♠ Agglutinative, ♣ Introflexive.…”
Section: Experiments and Results (mentioning)
confidence: 99%
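To make the notion of a 5-gram LM in the excerpt above concrete, the following is a minimal count-based 5-gram sketch with stupid-backoff scoring. It is only an illustration under simple assumptions; the benchmarked models (such as the Kneser-Ney-style baselines) use proper smoothing that this sketch does not attempt to reproduce.

```python
# A minimal count-based 5-gram LM with stupid-backoff scoring (illustrative
# only; not the smoothing used by the benchmarked models).
from collections import Counter

ORDER = 5

def train(tokens):
    """Count all n-grams up to ORDER from a token list."""
    counts = [Counter() for _ in range(ORDER + 1)]
    counts[0][()] = len(tokens)
    for n in range(1, ORDER + 1):
        for i in range(len(tokens) - n + 1):
            counts[n][tuple(tokens[i:i + n])] += 1
    return counts

def score(counts, context, word):
    """Stupid-backoff score of `word` given up to ORDER-1 context tokens."""
    context = tuple(context[-(ORDER - 1):])
    weight = 1.0
    while True:
        hist, full = counts[len(context)], counts[len(context) + 1]
        if hist[context] > 0 and full[context + (word,)] > 0:
            return weight * full[context + (word,)] / hist[context]
        if not context:
            return weight * 1e-9   # word never observed at all
        context = context[1:]      # back off to a shorter context
        weight *= 0.4              # stupid-backoff discount

text = "the cat sat on the mat and the cat sat on the rug".split()
counts = train(text)
print(score(counts, ["the", "cat", "sat", "on"], "the"))  # 1.0 on this toy corpus
```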