2017
DOI: 10.1007/978-3-319-57529-2_29
Topic Modeling over Short Texts by Incorporating Word Embeddings

Abstract: Inferring topics from the overwhelming amount of short texts becomes a critical but challenging task for many content analysis tasks, such as content characterization, user interest profiling, and emerging topic detection. Existing methods such as probabilistic latent semantic analysis (PLSA) and latent Dirichlet allocation (LDA) cannot solve this problem very well, since only very limited word co-occurrence information is available in short texts. This paper studies how to incorporate the external word correlation…
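The sparsity problem the abstract describes, and the embedding-based remedy, can be illustrated with a toy sketch. This is not the paper's actual model; it only shows the general idea of using external word correlations (here, hypothetical toy vectors standing in for pre-trained embeddings) to expand a short text's co-occurrence signal before topic modeling:

```python
import numpy as np

# Toy pre-trained embeddings (hypothetical 4-d vectors; a real system
# would load 300-d word2vec/GloVe vectors trained on a large corpus).
emb = {
    "apple":  np.array([0.9, 0.1, 0.0, 0.2]),
    "fruit":  np.array([0.8, 0.2, 0.1, 0.3]),
    "banana": np.array([0.85, 0.15, 0.05, 0.25]),
    "stock":  np.array([0.1, 0.9, 0.8, 0.0]),
    "market": np.array([0.0, 0.95, 0.7, 0.1]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def expand(doc_words, vocab=emb, k=2, threshold=0.9):
    """Augment a short text with semantically related words, so a topic
    model sees more (pseudo) co-occurrence than the raw text provides."""
    extra = []
    for w in doc_words:
        if w not in vocab:
            continue
        sims = sorted(
            ((cosine(vocab[w], vocab[o]), o) for o in vocab if o != w),
            reverse=True,
        )
        extra += [o for s, o in sims[:k] if s >= threshold]
    return doc_words + extra

# A two-word snippet alone yields a single co-occurrence pair; after
# expansion, embedding neighbors strengthen the topic signal.
print(expand(["apple", "market"]))
```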

Cited by 79 publications (52 citation statements) | References 26 publications
“…Data are collected from various media such as Twitter, question-and-answer collections, web searches, newsgroups, Reuters, various scientific documents, public data from different sites, and so on. These data are pre-processed with various techniques to remove unwanted information in unstructured data [22][23][24].…”
Section: Data Acquisition
confidence: 99%
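The excerpt does not specify the pre-processing techniques; a minimal sketch of the usual pipeline for noisy short texts (stopword list and regexes chosen purely for illustration) might look like:

```python
import re

# Illustrative subset; real pipelines use a full stopword list (e.g. NLTK's).
STOPWORDS = {"the", "a", "an", "is", "are", "to", "of", "and", "in", "out"}

def preprocess(text: str) -> list[str]:
    """Strip the 'unwanted information' typical of tweets and Q&A posts:
    URLs, @mentions, punctuation, case, and stopwords."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # URLs
    text = re.sub(r"@\w+", " ", text)          # user mentions
    text = re.sub(r"#", " ", text)             # keep hashtag words, drop the symbol
    text = re.sub(r"[^a-z\s]", " ", text)      # punctuation and digits
    return [t for t in text.split() if t not in STOPWORDS and len(t) > 2]

print(preprocess("Check out https://t.co/xyz ... LDA is great! @nlp_fan #topicmodeling"))
# -> ['check', 'lda', 'great', 'topicmodeling']
```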
“…Approaches to short-text topic modeling, with the number of surveyed papers per category:
Frequency of Co-occurrence Words (7): [5], [27], [31][32][33][34][35][36]
Pseudo documents (6): [22], [23], [36][37][38][39]
Word weighting (6): [26], [40][41][42][43][44]
Word Embedding (15): [31], [22], [26], [28], [45][46][47][48][49][50][51][52][53][54][55][56]
Sentence Level (1): [57]
Hash tags (5): [23], [43], [46], [58], [59]…”
Section: Title and Authors
confidence: 99%
“…Continuous word embeddings are known for capturing semantic regularities of words (Mikolov et al., 2013a; Collobert and Weston, 2008). Some works have made use of this fact to improve the resulting topics (Das et al., 2015; Nguyen et al., 2015; Qiang et al., 2016), but their objective is to improve the unsupervised modelling of a corpus instead of guiding the model towards a predefined set of topics. There are works that exploit word embeddings in a supervised machine learning setting to perform sentiment analysis (Tang et al., 2014; Giatsoglou et al., 2017).…”
Section: Related Work
confidence: 99%
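The "semantic regularities" this excerpt refers to (Mikolov et al., 2013a) are the near-linear analogy structure of the embedding space. A quick way to see them, assuming gensim and its downloadable glove-wiki-gigaword-50 vectors (an illustrative choice, not something used in the cited paper):

```python
import gensim.downloader as api

# Downloads the pre-trained GloVe vectors on first use (model name taken
# from gensim's public downloader catalogue; an assumption here).
vectors = api.load("glove-wiki-gigaword-50")

# The classic regularity: king - man + woman is close to queen.
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))

# Topic models over short texts can reuse the same geometry: words close
# in this space are treated as correlated even when they never co-occur
# in any single short document.
print(vectors.similarity("soccer", "football"))
```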