Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence 2019
DOI: 10.24963/ijcai.2019/749
Refining Word Representations by Manifold Learning

Abstract: Pre-trained distributed word representations have been proven useful in various natural language processing (NLP) tasks. However, the effect of words' geometric structure on word representations has not yet been carefully studied. Existing word representation methods underestimate the similarity of words whose distances are close in Euclidean space, while overestimating that of words separated by much greater distances. In this paper, we propose a word vector refinement model that corrects pre-trained word embeddings, bringing the similarity of words in Euclidean space closer to word semantics.
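To make the refinement idea in the abstract concrete, the sketch below applies a generic manifold-learning step (a locally linear, LLE-style reconstruction) to pre-trained vectors. The function name `refine_embeddings`, the neighbourhood size `k`, and the blending factor `alpha` are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: nudge each word vector toward a locally linear
# reconstruction built from its k nearest neighbours (an LLE-style step).
import numpy as np

def refine_embeddings(E, k=10, alpha=0.5):
    """E: (vocab_size, dim) matrix of pre-trained word vectors."""
    # Pairwise squared Euclidean distances (fine for a small vocabulary).
    d2 = ((E[:, None, :] - E[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    refined = E.copy()
    for i in range(E.shape[0]):
        nbrs = np.argsort(d2[i])[:k]              # k nearest neighbours
        Z = E[nbrs] - E[i]                        # centre the neighbourhood
        G = Z @ Z.T + 1e-3 * np.eye(k)            # regularised local Gram matrix
        w = np.linalg.solve(G, np.ones(k))
        w /= w.sum()                              # reconstruction weights sum to 1
        # Blend the original vector with its manifold reconstruction.
        refined[i] = (1 - alpha) * E[i] + alpha * (w @ E[nbrs])
    return refined

# Toy usage: random vectors stand in for pre-trained embeddings.
E = np.random.randn(100, 50).astype(np.float32)
E_refined = refine_embeddings(E, k=8)
```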

Cited by 10 publications (9 citation statements) · References 1 publication

Citation statements (ordered by relevance):
“…Accordingly, the average random cosine of the left point cloud in Figure 2 approaches 1 while the average random cosine similarity of the right point cloud approaches 0. It is well known that word embedding models have non-zero mean vectors (Yonghe et al, 2019;Liang et al, 2021). In the case of GPT-2 embeddings obtained from the WikiText-2 corpus (Merity et al, 2016), we find values in the mean vector range from −32.36 to 198.19.…”
Section: Word Embeddings (mentioning)
confidence: 63%
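The non-zero-mean observation quoted above can be reproduced in a few lines: when an embedding cloud shares a large common mean component, the cosine similarity of randomly chosen vector pairs drifts toward 1, and mean-centering pulls it back toward 0. The synthetic data below is a stand-in, not the GPT-2/WikiText-2 setup of the citing paper.

```python
# Hedged sketch of the non-zero-mean effect on random cosine similarity.
import numpy as np

rng = np.random.default_rng(0)
E = rng.normal(size=(5000, 300)) + 5.0          # point cloud with a shifted mean

def avg_random_cosine(X, n_pairs=10000):
    i = rng.integers(0, len(X), n_pairs)
    j = rng.integers(0, len(X), n_pairs)
    a, b = X[i], X[j]
    cos = (a * b).sum(1) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))
    return cos.mean()

print("mean vector range:", E.mean(0).min(), E.mean(0).max())
print("avg cosine (raw):     ", avg_random_cosine(E))                # close to 1
print("avg cosine (centered):", avg_random_cosine(E - E.mean(0)))    # close to 0
```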
“…In addition, some embeddings refinement methods were proposed to generate better semantic word representations. For instance, a word vector refinement model to correct the pre-trained word embedding by using manifold learning was proposed to bring the similarity of words in the Euclidean space closer to word semantics [33]. To learn sentiment word embeddings, a sentiment embedding refinement model based on adjusting the representations of words was used in sentiment analysis to make the word embeddings closer to words which are semantically similar and bear the same polarities and further away from words with opposing polarities [31].…”
Section: Sentiment Embeddings (mentioning)
confidence: 99%
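As a rough illustration of the sentiment-refinement scheme described in the statement above, the sketch below attracts each word vector toward nearby words of the same polarity and repels it from nearby words of the opposite polarity. The polarity labels, neighbour count, and step size are hypothetical choices, not the cited method [31].

```python
# Hypothetical sketch of polarity-aware embedding refinement.
import numpy as np

def refine_sentiment(E, polarity, k=5, lr=0.1, steps=5):
    """E: (V, d) embeddings; polarity: (V,) array of +1 / -1 labels."""
    E = E.copy()
    for _ in range(steps):
        d2 = ((E[:, None, :] - E[None, :, :]) ** 2).sum(-1)
        np.fill_diagonal(d2, np.inf)
        for i in range(len(E)):
            nbrs = np.argsort(d2[i])[:k]
            same = nbrs[polarity[nbrs] == polarity[i]]
            diff = nbrs[polarity[nbrs] != polarity[i]]
            if len(same):
                E[i] += lr * (E[same].mean(0) - E[i])   # attract same polarity
            if len(diff):
                E[i] -= lr * (E[diff].mean(0) - E[i])   # repel opposite polarity
    return E

# Toy usage with random vectors and random polarity labels.
V, d = 200, 50
E1 = refine_sentiment(np.random.randn(V, d), np.random.choice([-1, 1], size=V))
```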
“…In sentiment analysis, word embeddings learned by neural models can capture certain types of sentiment features from input texts [21], [22], [23], [24], [25], [26], [27]. Moreover, many embedding refinement methods have been proposed which refine the pre-trained word embeddings based on external knowledge for a better modeling of sentiment information [17], [28], [29], [30], [31], [32], [33]. However, these methods only focus on learning sentiment embeddings, without considering the contextual relations between targets and aspects in the task of TABSA.…”
Section: Introduction (mentioning)
confidence: 99%
“…In contrast, the SBERT-LP analyzes and solves the cosine metric problem of the SBERT sentence space on a manifold. Our work is inspired by the investigation of local geometry in the word space (Hasan and Curry, 2017; Yonghe et al, 2019). These methods solve semantic problems in word space.…”
Section: Related Work (mentioning)
confidence: 99%