Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d18-1059

Generalizing Word Embeddings using Bag of Subwords

Abstract: We approach the problem of generalizing pretrained word embeddings beyond fixed-size vocabularies without using additional contextual information. We propose a subword-level word vector generation model that views words as bags of character n-grams. The model is simple, fast to train and provides good vectors for rare or unseen words. Experiments show that our model achieves state-of-the-art performance on the English word similarity task and in joint prediction of part-of-speech tag and morphosyntactic attributes i…
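The abstract describes the model only at a high level. As a rough illustration of the bag-of-character-n-grams idea, the sketch below composes a word vector from the vectors of its character n-grams; the n-gram range (3–6), hashing into a fixed number of buckets, mean pooling, and the random placeholder table are assumptions for illustration, not the paper's exact configuration (in the actual model the n-gram vectors are learned).

```python
# Minimal sketch of a bag-of-subwords word vector.
# Assumptions: n-gram range 3-6, hashed n-gram buckets, mean pooling;
# the n-gram table would be learned in practice, random values are a placeholder.
import numpy as np

EMB_DIM = 100          # illustrative, smaller than a real setup
NUM_BUCKETS = 100_000  # illustrative hashed n-gram vocabulary size
rng = np.random.default_rng(0)
ngram_table = rng.normal(0.0, 0.1, size=(NUM_BUCKETS, EMB_DIM))

def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word padded with boundary markers '<' and '>'."""
    padded = f"<{word}>"
    return [padded[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(padded) - n + 1)]

def bag_of_subwords_vector(word):
    """Average the (hashed) n-gram vectors; applies to rare or unseen words too."""
    idx = [hash(g) % NUM_BUCKETS for g in char_ngrams(word)]
    return ngram_table[idx].mean(axis=0)

vec = bag_of_subwords_vector("morphosyntactic")  # shape (100,), even if OOV
```

Because the word vector depends only on character n-grams, any string, including one never seen during training, can be embedded at inference time.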

Cited by 39 publications (54 citation statements); references 13 publications.
“…However, these methods (or available authors' codes) do not work on the large-vocabulary settings employed in Sections 6.1 and 6.2. Thus, as an alternative, we strictly followed the experimental settings described in (Zhao et al., 2018) and compared the performance under fair conditions.…”
Section: Comparison With Previous Studies (mentioning)
confidence: 99%
“…Once we define the feature template, we can extract features of any word, and then we can compute the embedding for it. Some recent work (Pinter et al., 2017; Kim et al., 2018; Zhao et al., 2018; Artetxe et al., 2018) addresses the OOV problem using pre-trained embeddings and mimicking them by training a second model using substrings of a given word. Instead, here we can use arbitrary features and do not need pre-trained embeddings.…”
Section: Word Embeddings (mentioning)
confidence: 99%
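The quoted passage refers to approaches that "mimic" pretrained embeddings by training a second, subword-based model to reproduce them. The fragment below is a rough sketch of one such training step under assumed choices (squared-error loss and plain SGD), reusing `char_ngrams`, `ngram_table`, and `NUM_BUCKETS` from the sketch above; it is not the exact procedure of any of the cited papers.

```python
# Rough sketch of a "mimicking" update: push the bag-of-subwords vector for a
# word toward its pretrained target vector.
# Assumptions: squared-error loss, plain SGD; reuses char_ngrams, ngram_table,
# and NUM_BUCKETS defined in the earlier sketch.
import numpy as np

def mimick_step(word, target_vec, lr=0.05):
    """One SGD step on ||pred - target||^2, where pred is the mean of the
    word's n-gram vectors. Returns the loss before the update."""
    idx = [hash(g) % NUM_BUCKETS for g in char_ngrams(word)]
    pred = ngram_table[idx].mean(axis=0)
    err = pred - target_vec
    grad_row = 2.0 * err / len(idx)        # gradient w.r.t. each n-gram row
    for i in idx:
        ngram_table[i] -= lr * grad_row
    return float(np.sum(err ** 2))

# After training on in-vocabulary (word, pretrained-vector) pairs, vectors for
# unseen words come from bag_of_subwords_vector(word) alone.
```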
“…However, this rudimentary approach often detriments the performance of downstream tasks which contain numerous rare or unseen words. Recent works have proposed subword approaches (Zhao et al., 2018; Sennrich et al., 2015), which construct embeddings through the composition of characters or sentence pieces for OOV words. Vector space properties are also utilized to learn embeddings with small amounts of data (Bahdanau et al., 2017; Herbelot and Baroni, 2017).…”
Section: Introduction (mentioning)
confidence: 99%