2020
DOI: 10.1007/s10115-019-01429-z

A survey of recent methods on deriving topics from Twitter: algorithm to evaluation

Abstract: In recent years, studies related to topic derivation in Twitter have gained a lot of interest from businesses and academics. The interconnection between users and information has made social media, especially Twitter, an ultimate platform for propagation of information about events in real time. Many applications require topic derivation from this social media platform. These include, for example, disaster management, outbreak detection, situation awareness, surveillance, and market analysis. Deriving topics f…

Cited by 34 publications (26 citation statements)
References 122 publications
“…As social media posts are often short and may include mis-spelled words or irrelevant characters (such as emojis), social media text documents share an extremely low number of overlapping terms within a collection of posts. To address the sparsity problem, scholars have suggested alternative methods, such as LDA extension to author-topic model, and the dual LDA approach that relies on external knowledge bases like Wikipedia (Atefeh, and Khreich 2015; Nugroho et al 2020; Rafeeque and Sendhilkumar 2011).…”
Section: Latent Dirichlet Allocation (LDA)
Citation type: mentioning
confidence: 99%
“…Alternative LDA models or extensions have their own limitations when implemented in specific social media data. For example, Nugroho et al (2020) suggested that standard LDA performs, in fact, better than the author-topic model extension in the social media environment, and dual LDA only increases computational complexity. In addition, using external sources (content expansion) such as Wikipedia may present issues of reliability and uneven quality among documents when used in the dual LDA process, a problem which is amplified when comparing platforms such as Twitter and Weibo.…”
Section: Latent Dirichlet Allocation (LDA)
Citation type: mentioning
confidence: 99%
“…On the one hand, among the specialized tools it is worth mentioning that they tend to use a very technical language and, as a result, it is difficult for non-specialist users to interpret the results. As an example of this type of tools it is worth mentioning Cluto [10] and Weka [11], multi-platform tools that implement a great variety of automatic methods for data analysis. On the other hand, in the second category are tools used to analyze data that are published exclusively on social networks, and that seek to provide users, expert and inexpert, with sufficient elements to perform an easy and intuitive analysis of the results provided by their methods of identification of issues, polarity, etc.…”
Section: Related Studies
Citation type: mentioning
confidence: 99%
“…There is limited study on the multi-dimensional public opinion polarization phenomenon in a topic-derived context, while most studies conduct research from a single dimension of network public opinion topics and use macro statistics [1, 2] or a mathematical modeling method [3, 4] to analyze the formation process. In fact, after the outbreak of an event, affected by information with multiple tuples, discussion topics with multiple dimensions tend to be derived [5]. However, netizens’ debates on topics with different dimensions will enable the polarization of online public opinions to be multi-dimensional.…”
Section: Introduction
Citation type: mentioning
confidence: 99%