2021
DOI: 10.31158/jeev.2021.34.1.1
Exploring methods for determining the appropriate number of topics in LDA: Focusing on perplexity and harmonic mean method

Cited by 8 publications (4 citation statements) | References 0 publications
“…News agendas change the attitudes of information recipients [35], determining the level of public trust in a company [18] and the direction of business management [26, 45, 46]. The study findings support prior research that used LDA topic modeling with media report data [5, 52, 57, 58, 144]. The findings also confirm that the importance of issues on the fashion industry’s major agendas can be highlighted through media reports.…”
Section: Discussion (supporting)
confidence: 87%
“…Topic modeling is an unsupervised learning algorithm that groups keywords and documents with similar topics based on the words in large text data, under the assumption that the major topics follow a probability distribution [51]. LDA topic modeling uses a probabilistic algorithm suited to analyzing large quantities of digital data in scientific research [52]. Specifically, assuming a Dirichlet distribution in which a topic hierarchy exists among the words appearing in a document, the distribution of words in each document is calculated probabilistically to build the model.…”
Section: Literature Review (mentioning)
confidence: 99%
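The statement above describes LDA as modeling each document as a Dirichlet-governed mixture of topics and each topic as a distribution over words. The sketch below is purely illustrative and is not the procedure used in the cited paper; the gensim toolkit and the toy corpus are assumptions, chosen only to show the shape of such an analysis.

```python
# Illustrative sketch (assumed toolkit: gensim) of fitting an LDA model:
# documents become bag-of-words vectors, and LDA estimates a topic mixture
# per document and a word distribution per topic.
from gensim import corpora
from gensim.models import LdaModel

# Toy tokenized documents standing in for a large media-report corpus.
texts = [
    ["fashion", "brand", "media", "report", "agenda"],
    ["topic", "model", "probability", "word", "document"],
    ["fashion", "media", "agenda", "trust", "company"],
    ["media", "report", "fashion", "industry", "issue"],
]

dictionary = corpora.Dictionary(texts)                # token -> integer id
corpus = [dictionary.doc2bow(doc) for doc in texts]   # bag-of-words vectors

lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2,
               random_state=42, passes=10)

# Each topic is a probability distribution over words.
for topic_id, terms in lda.show_topics(num_topics=2, num_words=5, formatted=False):
    print(topic_id, [(word, round(prob, 3)) for word, prob in terms])

# Each document is a probability distribution over topics.
print(lda.get_document_topics(corpus[0]))
```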
“…If there are very few topics, redundancy may decrease and the topics are easy to interpret, but it is difficult to derive a variety of keywords (Greene et al., 2014). Accordingly, previous studies (Park and Lee, 2019; Kwon and Kim, 2021; Lee and Yi, 2021; Park, 2021; Park et al., 2022) have determined the number of topics through perplexity, topic coherence, the silhouette coefficient, and expert judgment. This study finally decided on the number of topics considering the silhouette coefficient built into NetMiner 4.0 and the results of previous studies.…”
Section: Text Mining (mentioning)
confidence: 99%
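The statement above lists perplexity, topic coherence, and the silhouette coefficient as criteria for choosing the number of topics, which is also the focus of the cited paper (perplexity and the harmonic mean method). Below is a minimal sketch of scanning candidate topic counts; gensim, the toy corpus, and the candidate range are assumptions for illustration, and NetMiner 4.0's built-in silhouette procedure is not reproduced here.

```python
# Illustrative sketch (assumed toolkit: gensim): scan candidate topic counts
# and report model fit (perplexity) and topic coherence for each.
from gensim import corpora
from gensim.models import LdaModel, CoherenceModel

# Toy tokenized corpus; a real study would use the full document collection.
texts = [
    ["fashion", "brand", "media", "report", "agenda"],
    ["topic", "model", "probability", "word", "document"],
    ["fashion", "media", "agenda", "trust", "company"],
    ["perplexity", "coherence", "topic", "number", "model"],
    ["silhouette", "cluster", "keyword", "document", "topic"],
    ["media", "report", "fashion", "industry", "issue"],
]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(doc) for doc in texts]

for k in range(2, 6):  # candidate numbers of topics (assumed range)
    lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=k,
                   random_state=42, passes=10)
    # log_perplexity returns a per-word likelihood bound; perplexity itself
    # is 2 ** (-bound), so lower perplexity means a better fit.
    bound = lda.log_perplexity(corpus)
    coherence = CoherenceModel(model=lda, texts=texts, dictionary=dictionary,
                               coherence="c_v").get_coherence()
    print(f"k={k}  perplexity={2 ** -bound:.2f}  coherence={coherence:.3f}")
```

In practice the candidate with low perplexity and high coherence is preferred, with expert judgment breaking ties, as the cited studies describe.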