2020
DOI: 10.17977/um018v3i22020p106-111

Generating Javanese Stopwords List using K-means Clustering Algorithm

Abstract: Stopword removal is necessary in Information Retrieval. It removes frequently occurring, general words to reduce memory usage. The algorithm eliminates each word that exactly matches a word in the stopword list. However, generating the list can be time-consuming: the words for a specific language and domain must be collected and validated by specialists. This research aims to develop a new way to generate a stopword list using the K-means Clustering method. The proposed approach groups words b…
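The exact-match removal step described in the abstract can be illustrated with a minimal Python sketch. It assumes whitespace tokenization, and both the function name `remove_stopwords` and the two-word stopword list are hypothetical placeholders rather than the list generated in the paper.

```python
def remove_stopwords(tokens, stopwords):
    """Drop every token that exactly matches an entry in the stopword list."""
    stopword_set = set(stopwords)  # set lookup keeps each membership test O(1)
    return [token for token in tokens if token not in stopword_set]


# Illustrative Javanese sentence, split on whitespace (an assumption).
tokens = "aku lan kowe lunga menyang pasar".split()

# Hypothetical stopword list for illustration only, not the paper's output.
stopwords = ["lan", "menyang"]

filtered = remove_stopwords(tokens, stopwords)
print(filtered)
```

Because matching is exact, inflected or differently cased variants of a stopword would pass through unless they are listed separately or the text is normalized first.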

Cited by 12 publications (12 citation statements)
References: 15 publications
“…Minimize the loss caused by reconstruction to obtain the optimal parameters *  and * '  as equation (4). The loss function used in this paper is the Kullback-Leibler divergence as equation (5).…”
Section: Basic Autoencoder (mentioning)
confidence: 99%
“…This can avoid over-learning insignificant features in short texts. We adjust formula (5) to formula (7) and (8) to calculate.…”
Section: L1 Normal Form Regularization (mentioning)
confidence: 99%
“…The classification results were analyzed using the confusion matrix approach [23]. The confusion matrix is a table used to show the effectiveness of an algorithm's output.…”
Section: Results Analysis (unclassified)
“…The fully connected layer process connects all the results of the neurons in the previous process to the next layer of neurons so that images can be classified. The confusion matrix approach was used to test the classification results [20]. The Confusion Matrix is a table that demonstrates how effective an algorithm's output results are.…”
Section: Scenario (mentioning)
confidence: 99%
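
The confusion matrix mentioned in the two statements above can be sketched in a few lines of Python. This is a generic illustration, not the evaluation code of either citing paper; the function name `confusion_matrix`, the labels, and the toy predictions are all made up for the example.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return a matrix where row i is the true label and column j the predicted label."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(true, pred)] for pred in labels] for true in labels]


# Toy data for illustration only (hypothetical labels and predictions).
labels = ["cat", "dog"]
y_true = ["cat", "cat", "dog", "dog", "dog"]
y_pred = ["cat", "dog", "dog", "dog", "cat"]

matrix = confusion_matrix(y_true, y_pred, labels)

# Accuracy is the share of samples on the diagonal (correct predictions).
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(y_true)
print(matrix, accuracy)
```

Reading along a row shows how the samples of one true class were distributed across predicted classes, which is what makes the table useful for judging an algorithm's output.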