Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d17-1199

Syllable-aware Neural Language Models: A Failure to Beat Character-aware Ones

Abstract: Syllabification does not seem to improve word-level RNN language modeling quality when compared to character-based segmentation. However, our best syllable-aware language model, achieving performance comparable to the competitive character-aware model, has 18%-33% fewer parameters and is trained 1.2-2.2 times faster.

Cited by 10 publications (27 citation statements) · References 18 publications
“…However, external fragmentation of words into morphemes propagates errors into the models, affecting the quality of the word embeddings [29]. Our work is similar to that of Assylbekov et al., Yu et al., and Mikolov et al. [9,21,30] in learning syllable and word representations. However, we utilized a defined syllabic alphabet to divide the words into syllables instead of an external hyphenation algorithm, which we hypothesize may introduce errors.…”
Section: Introduction (supporting)
confidence: 65%
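For context on the trade-off this citation describes, hyphenation-driven syllabification is usually obtained from an off-the-shelf hyphenation dictionary. The sketch below is a minimal illustration using the pyphen library (a Python wrapper around Hunspell hyphenation patterns); the example words are arbitrary, and hyphenation points only approximate true syllable boundaries, which is precisely the error source the citing authors hypothesize.

```python
# Minimal sketch of hyphenation-driven syllabification, the kind of
# external segmenter the quoted work replaces with a fixed syllabic
# alphabet. Requires: pip install pyphen
import pyphen

dic = pyphen.Pyphen(lang="en_US")  # Hunspell hyphenation dictionary

def syllabify(word: str) -> list[str]:
    """Split a word at the hyphenation points proposed by the dictionary.

    Note: hyphenation points approximate, but do not always equal,
    phonological syllable boundaries.
    """
    return dic.inserted(word).split("-")

for w in ["language", "syllable", "unfathomable"]:  # arbitrary examples
    print(w, "->", syllabify(w))
```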
“…This is motivated by the CNN's ability to extract high-quality features, leading to CNN models posting significant results in sentiment analysis [49], parsing [50], search query retrieval [51], and part-of-speech tagging [52]. The recent trend is to combine the strengths of the CNN and the RNN to design superior models for NLP [21,44,49,53].…”
Section: Deep Learning (mentioning)
confidence: 99%
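As a rough illustration of the CNN-plus-RNN combination this passage mentions, the PyTorch sketch below convolves over character embeddings, max-pools each word into a fixed-size vector, and passes the sequence of word vectors to an LSTM. All layer sizes and names are invented for the example and are not taken from any of the cited models.

```python
# Sketch of the CNN-over-characters + word-level RNN pattern;
# hyperparameters are arbitrary, not those of any cited paper.
import torch
import torch.nn as nn

class CharCNNThenLSTM(nn.Module):
    def __init__(self, n_chars=100, char_dim=16, n_filters=32, hidden=64):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        # Convolve along the character axis of each word.
        self.conv = nn.Conv1d(char_dim, n_filters, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(n_filters, hidden, batch_first=True)

    def forward(self, chars):
        # chars: (batch, seq_len, word_len) integer character ids
        b, t, w = chars.shape
        x = self.char_emb(chars.view(b * t, w))   # (b*t, w, char_dim)
        x = self.conv(x.transpose(1, 2))          # (b*t, n_filters, w)
        x = torch.relu(x).max(dim=2).values       # max-over-time pooling
        words = x.view(b, t, -1)                  # one vector per word
        out, _ = self.lstm(words)                 # contextualize with the RNN
        return out

model = CharCNNThenLSTM()
dummy = torch.randint(1, 100, (2, 5, 12))  # 2 sentences, 5 words, 12 chars
print(model(dummy).shape)  # torch.Size([2, 5, 64])
```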
“…
• CharCNN (Kim et al., 2016) is a character-aware convolutional model, which performs on par with the 2014-2015 state-of-the-art word-level LSTM model (Zaremba et al., 2014) despite having 60% fewer parameters.
• SylConcat is a simple concatenation of syllable embeddings suggested by Assylbekov et al. (2017), which underperforms CharCNN but has fewer parameters and is trained faster.
• MorphSum is a summation of morpheme embeddings, which is similar to the approach of Botha and Blunsom (2014) with one important difference: the embedding of the word itself is not included in the sum.
…”
Section: Data Set (mentioning)
confidence: 99%
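The two parameter-saving schemes named in this quote come down to different composition operators over subword embeddings. Below is a minimal sketch of both under assumed toy vocabularies and segmentations; the sizes, vocabularies, and function names are illustrative, not the cited papers' implementations.

```python
# Sketch contrasting the two subword-composition schemes named above:
# SylConcat concatenates syllable embeddings (fixed number of slots);
# MorphSum sums morpheme embeddings (the word's own embedding excluded).
# Vocabularies, sizes, and segmentations are invented for illustration.
import torch
import torch.nn as nn

syl_vocab = {"<pad>": 0, "syl": 1, "la": 2, "ble": 3}
morph_vocab = {"<pad>": 0, "un": 1, "break": 2, "able": 3}

syl_emb = nn.Embedding(len(syl_vocab), 8, padding_idx=0)
morph_emb = nn.Embedding(len(morph_vocab), 24, padding_idx=0)

def syl_concat(syllables, max_syls=4):
    """SylConcat-style: embed each syllable, pad to a fixed count, concatenate."""
    ids = [syl_vocab[s] for s in syllables][:max_syls]
    ids += [0] * (max_syls - len(ids))              # pad to max_syls slots
    return syl_emb(torch.tensor(ids)).flatten()     # (max_syls * 8,) = (32,)

def morph_sum(morphemes):
    """MorphSum-style: embed each morpheme and sum; no word embedding added."""
    ids = torch.tensor([morph_vocab[m] for m in morphemes])
    return morph_emb(ids).sum(dim=0)                # (24,)

print(syl_concat(["syl", "la", "ble"]).shape)    # torch.Size([32])
print(morph_sum(["un", "break", "able"]).shape)  # torch.Size([24])
```

Note the design difference the quote hinges on: concatenation preserves syllable order but fixes the number of slots, while summation is order-invariant and keeps the word vector size independent of how many morphemes a word has.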