2019
DOI: 10.48550/arxiv.1904.05033
Preprint

Better Word Embeddings by Disentangling Contextual n-Gram Information

Year Published: 2022, 2023

Cited by 2 publications (2 citation statements). References: 0 publications.
“…A straightforward way of using Word2Vec to represent sentences is by averaging word vectors [12,70]. Furthermore, several Word2Vec extensions optimize and predispose the original method directly for sentence embedding [26,42,16]. A more substantial modification to the original Word2Vec model is Doc2Vec, which introduces paragraph vectors [31].…”
Section: Sentence Embedding (mentioning)
confidence: 99%
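
The averaging approach described in the statement above can be sketched in a few lines. The snippet below is an illustrative sketch, not code from the cited paper or from this preprint; the word_vectors lookup table is a toy stand-in for any trained word-embedding model (for example, the word-to-vector mapping of a Word2Vec model).

import numpy as np

# Toy lookup table standing in for a trained word-embedding model
# (e.g. the word -> vector mapping learned by Word2Vec).
word_vectors = {
    "better":  np.array([0.1,  0.3, -0.2]),
    "word":    np.array([0.4, -0.1,  0.0]),
    "vectors": np.array([0.2,  0.2,  0.5]),
}

def sentence_embedding(tokens, vectors):
    # Represent a sentence as the average of its in-vocabulary word vectors;
    # fall back to a zero vector if no token is known.
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return np.zeros(len(next(iter(vectors.values()))))
    return np.mean(known, axis=0)

print(sentence_embedding(["better", "word", "vectors"], word_vectors))
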
“…For a given time window, we create a network whose nodes are concepts (aggregated keywords) present in articles published during that time window. Keywords are aggregated into "concepts" using embeddings from a Sent2Vec model (Gupta, Pagliardini, and Jaggi 2019); the simple average of token-level embeddings is used for keywords composed of multiple tokens. The Sent2Vec model is trained on a corpus of articles similar in domain to the domain that we analyze during the final scoring (e.g.…”
Section: Construction Of Network (mentioning)
confidence: 99%
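
As a rough illustration of the keyword-aggregation step described above, the sketch below embeds a multi-token keyword as the simple average of token-level vectors and compares two keyword embeddings by cosine similarity. It is an assumption-laden sketch: token_vectors stands in for the unigram embeddings of a trained Sent2Vec model, and the cosine comparison is only one plausible way of merging keywords into "concepts"; the citing paper's exact grouping rule is not given in the statement above.

import numpy as np

# Toy stand-in for the unigram (token-level) embeddings of a Sent2Vec model.
token_vectors = {
    "graph":   np.array([0.3, 0.1, 0.0]),
    "neural":  np.array([0.1, 0.4, 0.2]),
    "network": np.array([0.0, 0.2, 0.5]),
    "net":     np.array([0.1, 0.2, 0.4]),
}

def keyword_embedding(keyword, vectors):
    # Multi-token keywords are embedded as the simple average of their
    # token-level vectors, as described in the citation statement.
    tokens = keyword.lower().split()
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return np.zeros(len(next(iter(vectors.values()))))
    return np.mean(known, axis=0)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical aggregation check: keywords whose embeddings are sufficiently
# similar could be merged into a single "concept" node of the network.
a = keyword_embedding("graph neural network", token_vectors)
b = keyword_embedding("neural net", token_vectors)
print(cosine(a, b))
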