Accelerating Fuzzy-C Means Using an Estimated Subsample Size

Parker, Jonathon K.; Hall, Lawrence O.

doi:10.1109/tfuzz.2013.2286993

Cited by 80 publications

(28 citation statements)

References 47 publications

“…[29][30][31][32] The advantages of FCM over k-means or any other clustering algorithm is that an object can lie under any number of clusters with or without overlap and its intent of belongingness can be estimated according to the following constraints with n, and c being number of data samples and clusters, respectively:…”

Section: Fuzzy C Means (mentioning)

confidence: 99%

“…FCM is one among the popular clustering algorithm, which samples the data without following the statistical approach for determining the size of the subsample. [29][30][31][32] The advantages of FCM over k-means or any other clustering algorithm is that an object can lie under any number of clusters with or without overlap and its intent of belongingness can be estimated according to the following constraints with n, and c being number of data samples and clusters, respectively:…”

Section: Fuzzy C Means (mentioning)

confidence: 99%
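
The excerpt above is cut off before the constraints it refers to. As a hedged illustration only, the sketch below computes memberships with the textbook FCM update and checks the constraints usually stated for a fuzzy partition: each membership lies in [0, 1], the memberships of every sample sum to 1 over the c clusters, and no cluster is empty or holds all n samples outright. The function name `fcm_memberships`, the fuzzifier `m = 2.0`, and the toy data are assumptions made for this example; they are not taken from the citing paper.

```python
import math

def fcm_memberships(points, centers, m=2.0):
    """Return U[i][k], the membership of sample k in cluster i (textbook FCM update)."""
    c, n = len(centers), len(points)
    power = 2.0 / (m - 1.0)
    U = [[0.0] * n for _ in range(c)]
    for k, x in enumerate(points):
        dists = [math.dist(x, v) for v in centers]
        zero = [i for i, d in enumerate(dists) if d == 0.0]
        if zero:
            # Sample coincides with one or more centers: share membership among them.
            for i in zero:
                U[i][k] = 1.0 / len(zero)
            continue
        for i in range(c):
            U[i][k] = 1.0 / sum((dists[i] / dists[j]) ** power for j in range(c))
    return U

# Toy data (assumed): two samples near each of two hypothetical centers.
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 4.0), (5.0, 4.0)]
centers = [(0.5, 0.0), (4.5, 4.0)]
U = fcm_memberships(points, centers)

n, c = len(points), len(centers)
# Constraint 1: every membership is in [0, 1].
assert all(0.0 <= u <= 1.0 for row in U for u in row)
# Constraint 2: memberships of each sample sum to 1 across the c clusters.
assert all(abs(sum(U[i][k] for i in range(c)) - 1.0) < 1e-9 for k in range(n))
# Constraint 3: each cluster has total membership strictly between 0 and n.
assert all(0.0 < sum(row) < n for row in U)
```

Unlike k-means, every sample here keeps a nonzero degree of membership in both clusters, which is the overlap the excerpt describes.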

Performance enhancement of swarm intelligence techniques in dementia classification using dragonfly‐based hybrid algorithms

Bharanidharan

Rajaguru

2019

Int J Imaging Syst Tech

Most often clinicians require automated computer‐aided MRI classification techniques to substantiate the status of dementia accurately. In this research paper, dragonfly‐based features are used to improve the accuracy of well‐known swarm intelligence algorithms specifically particle swarm optimization, artificial bee colony, and ant colony optimization in dementia classification. Cross‐sectional MRI of 65 non‐dementia and 52 dementia subjects were collected from the OASIS database and analyzed. The dementia classification performance of above‐mentioned three individual swarm intelligence algorithms is compared with non‐swarm intelligence algorithm—Fuzzy C means. A further comparison was made on the improvisation of above‐mentioned swarm intelligence algorithms while using dragonfly‐based features and Fuzzy C means‐based features. Although many swarm intelligence algorithms are reported in the literature, it is ingenious to use dragonfly‐based features for improving the performance of individual swarm intelligence algorithms in dementia classification. With proper weight parameters, Dragonfly‐particle swarm optimization hybrid classifier yields the highest accuracy of 87.18%, whereas all the above‐mentioned individual classifiers yield accuracy of less than 66%.

“…Since FCM choose the initial centers randomly, the final result and especially its convergence speed significantly depends on the original center selection. A method proposed to address this problem is based on estimated subsample size to improve the initialization [18]. In the field of clustering large amounts of data, three types of methods have been proposed:…”

Section: Related Work (mentioning)

confidence: 99%
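
To make the idea in this excerpt concrete, here is a minimal sketch of subsample-based initialization: run FCM on a random subsample first, then use the resulting centers as the starting point for FCM on the full data. This is not the estimated-subsample-size procedure of the cited paper. In particular, `sample_size` is a fixed, hypothetical parameter here rather than a statistically estimated one, and the names `fcm` and `subsample_init_fcm`, the tolerance, and the synthetic data are all assumptions for illustration.

```python
import math
import random

def fcm(points, centers, m=2.0, tol=1e-4, max_iter=100):
    """Plain FCM from the given starting centers; returns (centers, iterations used)."""
    dim = len(points[0])
    power = 2.0 / (m - 1.0)
    for it in range(1, max_iter + 1):
        sums = [[0.0] * dim for _ in centers]
        weights = [0.0] * len(centers)
        for x in points:
            # Clamp distances so a sample sitting on a center cannot divide by zero.
            d = [max(math.dist(x, v), 1e-12) for v in centers]
            for i in range(len(centers)):
                u = 1.0 / sum((d[i] / dj) ** power for dj in d)
                w = u ** m
                weights[i] += w
                for a in range(dim):
                    sums[i][a] += w * x[a]
        # New center = membership-weighted mean of all samples.
        new = [tuple(s / wt for s in sums[i]) for i, wt in enumerate(weights)]
        shift = max(math.dist(a, b) for a, b in zip(centers, new))
        centers = new
        if shift < tol:
            break
    return centers, it

def subsample_init_fcm(points, c, sample_size, seed=0):
    """Cluster a random subsample, then reuse its centers to start FCM on all points."""
    rng = random.Random(seed)
    sample = rng.sample(points, sample_size)
    start = rng.sample(sample, c)           # random starting centers, as in plain FCM
    sample_centers, _ = fcm(sample, start)  # cheap pass on the subsample
    return fcm(points, sample_centers)      # full pass, started near the solution

# Synthetic data (assumed): two Gaussian blobs of 500 points each.
rng = random.Random(1)
data = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(500)]
data += [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(500)]
centers, iters = subsample_init_fcm(data, c=2, sample_size=100)
print(centers, iters)
```

The design choice is that the expensive full-data pass starts from centers that are already close to a solution, so it typically needs fewer iterations than a start from random samples.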

BigFCM: Fast, precise and scalable FCM on hadoop

Ghadiri

Ghaffari

Nikbakht

2017

Future Generation Computer Systems

Abstract: Clustering plays an important role in mining big data both as a modeling technique and a preprocessing step in many data mining process implementations. Fuzzy clustering provides more flexibility than non-fuzzy methods by allowing each data record to belong to more than one cluster to some degree. However, a serious challenge in fuzzy clustering is the lack of scalability. Massive datasets in emerging fields such as geosciences, biology and networking do require parallel and distributed computations with high performance to solve real-world problems. Although some clustering methods are already improved to execute on big data platforms, but their execution time is highly increased for large datasets. In this paper, a scalable Fuzzy C-Means (FCM) clustering named BigFCM is proposed and designed for the Hadoop distributed data platform. Based on the map-reduce programming model, it exploits several mechanisms including an efficient caching design to achieve several orders of magnitude reduction in execution time. Extensive evaluation over multi-gigabyte datasets shows that BigFCM is scalable while it preserves the quality of clustering.

“…First is the distance measure strategy and second is initial centroids selection strategy to minimize processing speed and increase stability. Paper [4] introduces two accelerated clustering algorithms using estimated subsample size and the novel stopping criterion. Authors in the paper [5] present a systematic study of kmeans-based consensus clustering algorithm, identify necessary and sufficient conditions for the algorithms on both pure and noisy datasets.…”

Section: Related Work (mentioning)

confidence: 99%

“…The authors in the paper [12] propose two novel enhanced algorithms such as geometric progressive fuzzy c-means and minimum sample estimate random fuzzy c-means by using some statistical techniques. This is to compute the size subsamples.…”

Section: Related Work (mentioning)

confidence: 99%

Performance Analysis of Improved Clustering Algorithm on Real and Synthetic Data

Khandare

Alvi

2017

IJCNIS

Abstract: Clustering is an important technique in data mining to partition the data objects into clusters. It is a way to generate groups from the data objects. Different data clustering methods or algorithms are discussed in the various literature. Some of these are efficient while some are inefficient for large data. The k-means, Partition Around Method (PAM) or k-medoids, hierarchical and DBSCAN are various clustering algorithms. The k-means algorithm is more popular than the other algorithms used to partition data into k clusters. For this algorithm, k should be provided explicitly. Also, initial means are taken randomly but this may generate clusters with poor quality. This paper is a study and implementation of an improved clustering algorithm which automatically predicts the value of k and uses a new technique to take initial means. The performance analysis of the improved algorithm and other algorithms by using real and dummy datasets is presented in this paper. To measure the performance of algorithms, this paper uses running time of algorithms and various cluster validity measures. Cluster validity measures include sum squared error, silhouette score, compactness, separation, Dunn index and DB index. Also, the k predicted by the improved algorithm is compared with optimal k suggested by elbow method. It is found that both values of k are almost similar. Most of the values of validity measures for the improved algorithm are found to be optimal.
