2022
DOI: 10.1007/978-3-030-88389-8_16
Text Representations and Word Embeddings

Cited by 19 publications (13 citation statements)
References 43 publications
“…Top2Vec (Angelov, 2020) is a comparatively new algorithm that uses word embeddings. That is, the vectorization of text data makes it possible to locate semantically similar words, sentences, or documents within spatial proximity (Egger, 2022a). For example, words like "mom" and "dad" should be closer than words like "mom" and "apple."…”
Section: Model 3: Top2Vec
Citation type: mentioning (confidence: 99%)
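To make the spatial-proximity idea in the excerpt above concrete, the following minimal Python sketch compares cosine similarities between toy word vectors. The 4-dimensional values are invented purely for illustration; they are not output of Top2Vec or any real embedding model, which would typically use hundreds of dimensions.

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine of the angle between two embedding vectors (1.0 = same direction).
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Invented toy vectors standing in for learned word embeddings.
toy_embeddings = {
    "mom":   np.array([0.9, 0.8, 0.1, 0.0]),
    "dad":   np.array([0.8, 0.9, 0.2, 0.1]),
    "apple": np.array([0.1, 0.0, 0.9, 0.8]),
}

print(cosine_similarity(toy_embeddings["mom"], toy_embeddings["dad"]))    # high: semantically close
print(cosine_similarity(toy_embeddings["mom"], toy_embeddings["apple"]))  # low: semantically distant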
“…Similar to embedding-based topic modeling approaches (Egger & Yu, 2022), topological data analysis involves representing lists of topics in a vector space. For this purpose, the preprocessed text is converted into numerical representations (Egger, 2022b, 2022c).…”
Section: Methods
Citation type: mentioning (confidence: 99%)
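As an illustration of the vectorization step described in the excerpt, the sketch below converts two preprocessed example documents into numerical representations with a plain TF-IDF vectorizer from scikit-learn. The cited works may instead use neural embeddings, so this is only an assumed stand-in for the general text-to-vector conversion.

from sklearn.feature_extraction.text import TfidfVectorizer

# Two already-preprocessed example documents, made up for this sketch.
documents = [
    "word embeddings place similar words close together",
    "topic models group documents by shared themes",
]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(documents)  # sparse matrix: one row per document
print(doc_vectors.shape)  # (2, vocabulary_size)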
“…As Camastra and Vinciarelli point out [35], using more features than is strictly necessary leads to several problems, one of the main ones being the space needed to store the data. As the amount of available information increases, compression for storage becomes even more critical [12, 36, 37]. Additionally, for the scope of this work, it cannot be ignored that applying dimensionality-reduction techniques to pre-computed embeddings improves neither the runtime nor the memory requirements of running the models.…”
Section: Related Work
Citation type: mentioning (confidence: 99%)
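A hedged sketch of the point made in that excerpt: reducing the dimensionality of pre-computed embeddings (here with PCA, chosen as an assumed example technique; the random data and target dimension are also assumptions) shrinks the stored vectors, but it cannot speed up the model that produced them, because the embeddings are computed at full size first.

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 300))  # stand-in for 1000 pre-computed 300-d vectors

pca = PCA(n_components=50)                 # target dimensionality is an arbitrary choice
reduced = pca.fit_transform(embeddings)

print(embeddings.nbytes, reduced.nbytes)   # storage shrinks roughly by the 300/50 ratio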
“…To the best of our knowledge, dimensionality-reduction research on embeddings in the literature has focused on statistical methods, such as Bag of Words and Term Frequency-Inverse Document Frequency (TF-IDF) [27, 37], and on classical pre-computed word embeddings, including the popular GloVe or FastText embeddings [21–24, 36, 49]. These classical word embeddings are more complex and powerful than statistical methods.…”
Section: Related Work
Citation type: mentioning (confidence: 99%)
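The contrast drawn in that excerpt can be sketched as follows: a statistical Bag-of-Words representation has one dimension per vocabulary word and is sparse, while a classical pre-computed embedding such as GloVe is dense with a fixed, much smaller dimensionality. The example sentences are invented, and the gensim model name and download step are assumptions for illustration only.

from sklearn.feature_extraction.text import CountVectorizer
import gensim.downloader as api

texts = ["the cat sat on the mat", "the dog chased the cat"]

bow = CountVectorizer().fit_transform(texts)
print("Bag-of-Words shape:", bow.shape)               # (2, vocabulary_size), sparse counts

glove = api.load("glove-wiki-gigaword-50")            # downloads a pretrained model on first use
print("GloVe vector for 'cat':", glove["cat"].shape)  # (50,), dense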