2014 International Conference on Computational Intelligence and Communication Networks 2014
DOI: 10.1109/cicn.2014.123

Comparison of Algorithms for Document Clustering

Abstract: Clustering is "the method of organizing objects into groups whose members are related in some way". A cluster is therefore a collection of objects that are internally coherent but clearly dissimilar to the objects belonging to other clusters. Document clustering is used in many fields, such as data mining and information retrieval. The main goal of this paper is therefore to compare the performance of criterion functions in the context of the partitional clustering approach, k-means, and agglomerati…

Cited by 6 publications (3 citation statements)
References 7 publications
“…We explored different nonlinear approaches including manifold learning, kernel PCA (KPCA), isometric mapping (IsoMap), locally linear embedding (LLE), multidimensional scaling (MDS), and uniform manifold approximation and projection (UMAP) on the word embeddings. 53), Ward linkage in Shehata (54), and BIRCH in Gupta and Rajavat (55). With the exception of UMAP and spherical k-means, all the dimensionality reduction and clustering algorithms mentioned above have been implemented in Python using the Scikit-learn library (56).…”
Section: Dimensionality Reduction and Clustering (mentioning)
confidence: 99%
“…It is not sensitive with respect to noise, but it takes more time compared to K-means. [30] 3)CURE: Clustering Using Representatives is also a hierarchical based clustering technique. It selects well-scattered points from the cluster and then reduces them in the center of the cluster by a specified fraction.…”
Section: Text Clustering (mentioning)
confidence: 99%
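The CURE step described in the statement above (pick well-scattered points, then move them toward the cluster center by a fixed fraction) can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: the function names, the farthest-point selection heuristic, and the parameters `c` (number of representatives) and `alpha` (shrink fraction) are assumptions made for this example.

```python
import math


def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def scattered_points(points, c):
    """Pick up to c well-scattered points (farthest-point heuristic, assumed here)."""
    center = centroid(points)
    # Start with the point farthest from the centroid.
    chosen = [max(points, key=lambda p: math.dist(p, center))]
    while len(chosen) < min(c, len(points)):
        remaining = [p for p in points if p not in chosen]
        if not remaining:  # only duplicates of already chosen points are left
            break
        # Next pick: the point whose nearest chosen point is farthest away.
        chosen.append(
            max(remaining, key=lambda p: min(math.dist(p, q) for q in chosen))
        )
    return chosen


def cure_representatives(points, c, alpha):
    """Shrink the scattered points toward the centroid by the fraction alpha."""
    center = centroid(points)
    return [
        tuple(x + alpha * (m - x) for x, m in zip(rep, center))
        for rep in scattered_points(points, c)
    ]


# Hypothetical 2-D cluster: four corners of a square plus its middle point.
cluster = [(0, 0), (4, 0), (0, 4), (4, 4), (2, 2)]
representatives = cure_representatives(cluster, c=4, alpha=0.5)
```

With `alpha = 0` the representatives stay on the cluster boundary; with `alpha = 1` they all collapse onto the centroid. Intermediate values dampen the influence of outlying points, which fits the statement's remark that the method is not sensitive to noise.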
“…In the current digital world, huge quantity of data is stored in variety of forms like image, text, audio, video and this may increase in the future [1]. The increase of digital information increases the demand of tools for analysis and discovers useful information.…”
Section: Introduction (mentioning)
confidence: 99%