2019
DOI: 10.33557/journalisi.v1i2.18

Analysis of Document Clustering based on Cosine Similarity and K-Main Algorithms

Abstract: Clustering is a useful technique that organizes a large number of unordered text documents into a small number of meaningful and coherent clusters. Effective and efficient organization of documents is needed to support intuitive and informative tracking mechanisms. In this paper, we propose clustering documents using cosine similarity and k-main. The experimental results show that our method achieves an accuracy of 84.3%.
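The abstract does not describe the preprocessing, term weighting, or initialization the authors used, so the following is only a minimal sketch of the general idea: documents are assigned to the centroid with the highest cosine similarity, in the style of K-Means. The raw term-frequency vectors, the "first k documents" initialization, the toy documents, and every function name here are illustrative assumptions, not the paper's method.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (assumed weighting, not from the paper)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {t: w / len(vectors) for t, w in total.items()}


def cosine_kmeans(docs, k, iterations=10):
    """K-Means-style clustering that assigns by highest cosine similarity."""
    vectors = [tf_vector(d) for d in docs]
    # Naive initialization (assumption): use the first k documents as centroids.
    centroids = [dict(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: pick the most similar centroid for each document.
        labels = [
            max(range(k), key=lambda c: cosine_similarity(v, centroids[c]))
            for v in vectors
        ]
        # Update step: recompute each centroid as the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels


# Toy documents, made up for illustration only.
docs = [
    "cat dog pet animal",
    "stock market trading price",
    "dog pet puppy animal",
    "market price stock shares",
]
labels = cosine_kmeans(docs, k=2)
print(labels)  # [0, 1, 0, 1]
```

In practice a library implementation with TF-IDF weighting and a better initialization would likely be preferable; the sketch only makes the assignment and update steps concrete.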

Cited by 3 publications (3 citation statements)
References 16 publications (23 reference statements)
“…The combination of these approaches is used to address the problem by bringing together the strengths of earlier studies. This study also uses K-Means to cluster final-project texts (title + abstract), which typically relies on cosine similarity, as in other similar studies that instead aim at recommending final-project supervisors [7] or at computing the similarity of document clustering results [8].…”
Section: Introduction (unclassified)
“…Cosine similarity is another method that uses unsupervised learning techniques like Word2Vec CBoW, Word2Vec Skip-gram, and TF-IDF to determine how similar court documents are to one another [10]. Cosine similarity is employed to group lawful [11]. This research aims to confidently assist lawmakers and legal writers in thoroughly searching for keywords (norms) in UU using Bahasa and implementing them in lawmaking with the aid of a search engine, this approach will significantly save time in understanding each existing law.…”
Section: Introduction (mentioning)
confidence: 99%
“…The core objectives of author research are twofold. Firstly, author endeavour to elevate the performance of document clustering by incorporating cosine similarity within the framework of the K-Means algorithm [9]. This approach is selected for its effectiveness in capturing semantic similarities among textual data, rendering it particularly apt for the nuanced analysis required in comment clustering.…”
Section: Introduction (mentioning)
confidence: 99%