2021
DOI: 10.1016/j.knosys.2020.106611

Decomposing word embedding with the capsule network


Cited by 7 publications (5 citation statements) · References 8 publications
“…Pre-trained sentence or word embeddings encode the model input into an embedding vector. The experiments are performed with RoBERTa [54], XLNet [73], ALBERT [74], DistilBERT [72], and BiLSTM [71] models using word2vec (w2v) [85], Sigmoid [86], GLU [87], and global vector (GloVe) [88] embeddings. A sentence encoder builds a 768-dimensional hidden layer for each individual phrase representation.…”
Section: Hyper-parameters (mentioning)
confidence: 99%
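
A minimal sketch of the encoding step described above, assuming a Hugging Face RoBERTa checkpoint ("roberta-base") and mean pooling over tokens; these choices are illustrative assumptions and not details taken from the cited experiments.

import torch
from transformers import AutoTokenizer, AutoModel

# Load an assumed pretrained encoder; the cited work may use different checkpoints.
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
encoder = AutoModel.from_pretrained("roberta-base")

sentence = "Capsule networks decompose word embeddings."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = encoder(**inputs)

# last_hidden_state has shape (1, seq_len, 768); mean-pool over tokens to get one 768-d vector.
sentence_embedding = outputs.last_hidden_state.mean(dim=1)
print(sentence_embedding.shape)  # torch.Size([1, 768])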
“…However, that does not properly discriminate between different users, even though different users in a tweet propagation network may contribute differently to classifying the tweet. In our UMLARD, we introduce a capsule attention layer inspired by the recent success of capsule networks [62][63][64]. The capsule network was first proposed in [62]; its main idea is to replace the scalar-output feature detectors of traditional neural networks with vector-output capsules and to train the model with the dynamic routing algorithm.…”
Section: Capsule Attention for User-level Feature Fusion (mentioning)
confidence: 99%
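
To make the dynamic routing idea concrete, here is a minimal sketch in the style of Sabour et al. (2017): prediction vectors from lower-level capsules are combined by coupling coefficients that are refined over a few iterations. The shapes and iteration count below are arbitrary assumptions for illustration.

import torch
import torch.nn.functional as F

def squash(s, dim=-1, eps=1e-8):
    # Keep the vector's orientation but squash its length into [0, 1).
    norm_sq = (s ** 2).sum(dim=dim, keepdim=True)
    return (norm_sq / (1.0 + norm_sq)) * s / torch.sqrt(norm_sq + eps)

def dynamic_routing(u_hat, num_iterations=3):
    # u_hat: (num_in, num_out, dim_out) prediction vectors from lower-level capsules.
    num_in, num_out, _ = u_hat.shape
    b = torch.zeros(num_in, num_out)               # routing logits
    for _ in range(num_iterations):
        c = F.softmax(b, dim=1)                    # coupling coefficients
        s = (c.unsqueeze(-1) * u_hat).sum(dim=0)   # weighted sum -> (num_out, dim_out)
        v = squash(s)                              # vector-output capsules
        b = b + (u_hat * v.unsqueeze(0)).sum(-1)   # agreement updates the logits
    return v

v = dynamic_routing(torch.randn(6, 4, 16))
print(v.shape)  # torch.Size([4, 16])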
“…The method leverages contextual embeddings, glosses, and semantic networks to achieve full coverage. A more recent system [26] uses BERT to learn context embeddings and a capsule network to decompose the word embedding into multiple morpheme-like vectors.…”
Section: Related Work (mentioning)
confidence: 99%
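
The decomposition mentioned here can be approximated with a simple sketch that projects one word vector into several capsule vectors; this is an illustrative simplification, not the exact CapsDecE2S architecture, and the number of capsules and the dimensions are assumptions.

import torch
import torch.nn as nn

class WordDecomposer(nn.Module):
    def __init__(self, embed_dim=300, num_capsules=8, capsule_dim=64):
        super().__init__()
        # One linear map per capsule turns the word vector into a candidate "morpheme-like" vector.
        self.projections = nn.ModuleList(
            [nn.Linear(embed_dim, capsule_dim) for _ in range(num_capsules)]
        )

    @staticmethod
    def squash(s, eps=1e-8):
        norm_sq = (s ** 2).sum(-1, keepdim=True)
        return (norm_sq / (1.0 + norm_sq)) * s / torch.sqrt(norm_sq + eps)

    def forward(self, word_vec):
        # word_vec: (batch, embed_dim) -> (batch, num_capsules, capsule_dim)
        capsules = torch.stack([proj(word_vec) for proj in self.projections], dim=1)
        return self.squash(capsules)

decomposer = WordDecomposer()
morpheme_vectors = decomposer(torch.randn(2, 300))
print(morpheme_vectors.shape)  # torch.Size([2, 8, 64])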
“…It shows the strength of contextual representation and the effectiveness of incorporating gloss knowledge. CapsDecE2S [26] utilizes a capsule network to decompose the unsupervised word embedding into multiple morpheme-like vectors and merges them by contextual attention to generate context-specific sense embeddings. CapsDecE2S and GlossBERT are two strong baselines that are hard to beat.…”
(mentioning)
confidence: 99%
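
To illustrate how contextual attention could merge the morpheme-like vectors into one context-specific sense embedding, here is a generic attention sketch; the scoring function and the dimensions are assumptions and not necessarily the exact CapsDecE2S formulation.

import torch
import torch.nn.functional as F

def contextual_merge(capsules, context_vec, w_query):
    # capsules: (K, d_c) morpheme-like vectors; context_vec: (d_ctx,), e.g. a BERT vector.
    query = context_vec @ w_query                 # project the context into capsule space
    scores = capsules @ query                     # (K,) relevance of each capsule
    weights = F.softmax(scores, dim=0)            # attention distribution over capsules
    sense_embedding = (weights.unsqueeze(-1) * capsules).sum(dim=0)  # (d_c,)
    return sense_embedding, weights

capsules = torch.randn(8, 64)    # 8 morpheme-like vectors of dimension 64
context = torch.randn(768)       # contextual embedding, e.g. from BERT
w_q = torch.randn(768, 64)       # assumed projection matrix
sense, attn = contextual_merge(capsules, context, w_q)
print(sense.shape, attn.shape)   # torch.Size([64]) torch.Size([8])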