2017 15th International Conference on Quality in Research (QiR): International Symposium on Electrical and Computer Engineering, 2017
DOI: 10.1109/qir.2017.8168448
Indonesian Text Feature Extraction Using Gibbs Sampling and Mean Variational Inference Latent Dirichlet Allocation

Cited by 10 publications (7 citation statements)
References 12 publications
“…Case folding, or lowercasing, is the process of converting all characters in the clean text dataset into lowercase [10]. This is done to avoid errors when identifying a specific term in the dataset [25].…”
Section: Preprocessing
confidence: 99%
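As a minimal sketch of the case folding step described above (the function name and sample text are illustrative, not from the cited paper):

```python
def case_fold(text: str) -> str:
    """Convert every character to lowercase so that variants such as
    "Berita" and "berita" are identified as the same term."""
    return text.lower()

# Example: an Indonesian headline before and after case folding.
print(case_fold("Berita UTAMA Hari Ini"))  # berita utama hari ini
```

In practice this step runs on the cleaned text before tokenization, so later term counting sees a single spelling of each word.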
“…In this research, feature extraction is done with the topic-based LDA method. LDA has several inference algorithms, one of which is Gibbs Sampling, which has proven effective in the topic sampling process [28]. In general, during initialization, Gibbs Sampling assigns a topic to each term at random using a multinomial random function.…”
Section: Term Weighting With Fuzzy Luhn's Gibbs LDA
confidence: 99%
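The initialization step mentioned above can be sketched as follows (a toy illustration, not the cited paper's implementation; drawing uniformly at random is the special case of a multinomial draw with equal topic probabilities):

```python
import random

def init_topic_assignments(docs, num_topics, seed=42):
    """Assign a random topic index to every term occurrence in every
    document, as in the initialization of Gibbs Sampling for LDA.
    `docs` is a list of tokenized documents."""
    rng = random.Random(seed)
    return [[rng.randrange(num_topics) for _ in doc] for doc in docs]

# Two tiny tokenized documents, three topics.
docs = [["berita", "utama", "hari"], ["topik", "berita"]]
z = init_topic_assignments(docs, num_topics=3)
```

Subsequent Gibbs iterations would then resample each assignment conditioned on the current topic counts until the chain converges.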
“…LDA is a generative probabilistic model of a corpus in which documents are represented as random mixtures over latent topics, and each topic is characterized by a distribution over words [25]. A document in a corpus is not identified with a single topic only, but can be identified with several topics, each with its own probability [26]-[28].…”
Section: Introduction
confidence: 99%
“…Latent Dirichlet Allocation (LDA) is a topic-based feature extraction method at the term, document, and corpus level [12]. Gibbs Sampling is one of the inference methods used for LDA [9]. LDA requires a sampling process that is repeated until it reaches convergence.…”
Section: Fuzzy Gibbs Latent Dirichlet Allocation
confidence: 99%
“…Wang, Zhou, Jin, Liu, and Lu used four methods: One-Hot Encoding, Term Frequency-Inverse Document Frequency (TF-IDF) weighting, word2vec, and paragraph2vec, applied to short-text classification with Naive Bayes (NB), Logistic Regression, Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Decision Tree as classifiers [4] [7]. Prihatini, Putra, Giriantari, and Sudarma used TF-IDF as the feature selection method and Fuzzy Gibbs Latent Dirichlet Allocation as the feature extraction method for clustering digital news text, and found that the topic model gives better results in feature extraction than the classical model: the topic model distributes each term over all topics with different probabilities, while the classical model assigns each term to only one topic [8] [9]. This paper discusses three main factors in improving search engine performance.…”
Section: Introduction
confidence: 99%
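The classical TF-IDF weighting contrasted with the topic model above can be sketched with a toy, smoothing-free formula (real implementations such as scikit-learn's TfidfVectorizer add smoothing and normalization; the corpus here is an assumption for illustration):

```python
import math
from collections import Counter

def tf_idf(docs):
    """Weight each term t in each document d as tf(t, d) * log(N / df(t)),
    where N is the corpus size and df(t) counts documents containing t."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return weights

# A term appearing in every document gets weight 0; rarer terms score higher.
w = tf_idf([["berita", "utama"], ["berita", "topik"]])
```

Because each term receives a single weight per document, this is the "one topic per term" behavior the citation contrasts with LDA's per-topic probabilities.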