Generalizing Word Embeddings using Bag of Subwords

Zhao, Jinman; Mudgal, Sidharth; Liang, Yingyu

doi:10.18653/v1/d18-1059

Cited by 39 publications

(54 citation statements)

References 13 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…However, these methods (or available authors' codes) do not work on the large-vocabulary settings employed in Sections 6.1 and 6.2. Thus, as an alternative, we strictly followed the experimental settings described in (Zhao et al, 2018) and compared the performance under fair conditions.…”

Section: Comparison With Previous Studiesmentioning

confidence: 99%

Untitled

Sasaki

Suzuki

Inui

2019

Proceedings of the 2019 Conference of the North

View full text Add to dashboard Cite

The idea of subword-based word embeddings has been proposed in the literature, mainly for solving the out-of-vocabulary (OOV) word problem observed in standard word-based word embeddings. In this paper, we propose a method of reconstructing pre-trained word embeddings using subword information that can effectively represent a large number of subword embeddings in a considerably small fixed space. The key techniques of our method are twofold: memory-shared embeddings and a variant of the key-value-query self-attention mechanism. Our experiments show that our reconstructed subword-based embeddings can successfully imitate well-trained word embeddings in a small fixed space while preventing quality degradation across several linguistic benchmark datasets, and can simultaneously predict effective embeddings of OOV words. We also demonstrate the effectiveness of our reconstruction method when we apply them to downstream tasks 1 .

show abstract

Section: Comparison With Previous Studiesmentioning

confidence: 99%

Untitled

Sasaki

Suzuki

Inui

2019

Proceedings of the 2019 Conference of the North

View full text Add to dashboard Cite

show abstract

“…Once we define the feature template, we can extract features of any word, then we can compute the embedding for it. Some recent work (Pinter et al, 2017;Kim et al, 2018;Zhao et al, 2018;Artetxe et al, 2018) address the OOV problem using pre-trained embeddings and mimicking them by training a second model using substrings of a given word. Instead, here we can use arbitrary features and do not need pre-trained embeddings.…”

Section: Word Embeddingsmentioning

confidence: 99%

Beyond Context: A New Perspective for Word Embeddings

Zhou¹,

Srikumar

2019

Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019)

View full text Add to dashboard Cite

Most word embeddings today are trained by optimizing a language modeling goal of scoring words in their context, modeled as a multiclass classification problem. Despite the successes of this assumption, it is incomplete: in addition to its context, orthographical or morphological aspects of words can offer clues about their meaning. In this paper, we define a new modeling framework for training word embeddings that captures this intuition. Our framework is based on the well-studied problem of multi-label classification and, consequently, exposes several design choices for featurizing words and contexts, loss functions for training and score normalization. Indeed, standard models such as CBOW and FAST-TEXT are specific choices along each of these axes. We show via experiments that by combining feature engineering with embedding learning, our method can outperform CBOW using only 10% of the training data in both the standard word embedding evaluations and also text classification experiments.

show abstract

“…However, this rudimentary approach often detriments the performance of down-Published as a conference paper at ACL 2019 stream tasks which contain numerous rare or unseen words. Recent works have proposed subword approaches (Zhao et al, 2018;Sennrich et al, 2015), which construct embeddings through the composition of characters or sentence pieces for OOV words. Vector space properties are also utilized to learn embeddings with small amounts of data (Bahdanau et al, 2017;Herbelot and Baroni, 2017).…”

Section: Introductionmentioning

confidence: 99%

Embedding Imputation with Grounded Language Information

Yang¹,

Zhu²,

Sachidananda³

et al. 2019

Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

View full text Add to dashboard Cite

Due to the ubiquitous use of embeddings as input representations for a wide range of natural language tasks, imputation of embeddings for rare and unseen words is a critical problem in language processing. Embedding imputation involves learning representations for rare or unseen words during the training of an embedding model, often in a post-hoc manner. In this paper, we propose an approach for embedding imputation which uses grounded information in the form of a knowledge graph. This is in contrast to existing approaches which typically make use of vector space properties or subword information. We propose an online method to construct a graph from grounded information and design an algorithm to map from the resulting graphical structure to the space of the pre-trained embeddings. Finally, we evaluate our approach on a range of rare and unseen word tasks across various domains and show that our model can learn better representations. For example, on the Card-660 task our method improves Pearson's and Spearman's correlation coefficients upon the stateof-the-art by 11% and 17.8% respectively using GloVe embeddings.

show abstract

Generalizing Word Embeddings using Bag of Subwords

Cited by 39 publications

References 13 publications

Untitled

Untitled

Beyond Context: A New Perspective for Word Embeddings

Embedding Imputation with Grounded Language Information

Contact Info

Product

Resources

About