2000
DOI: 10.1007/3-540-45591-4_51

Scalable Parallel Clustering for Data Mining on Multicomputers

Abstract: This paper describes the design and implementation on MIMD parallel machines of P-AutoClass, a parallel version of the AutoClass system based upon the Bayesian method for determining optimal classes in large datasets. The P-AutoClass implementation divides the clustering task among the processors of a multicomputer so that they work on their own partition and exchange their intermediate results. The system architecture, its implementation and experimental performance results on different processor nu…

Citations: cited by 31 publications (19 citation statements)
References: 6 publications
“…Surprisingly our results show that by using the concept of clustering and parallelism, search becomes more cost effective, time effective and the quality of the search becomes accurate. Our results show that this strategy is able to cause efficient performance both in large scale and small scale search engines [16].…”
Section: Input (mentioning)
confidence: 76%
“…developed scalable parallel clustering models for data mining on multi-computers in their research paper in [16]. They designed & implemented on MIMD parallel machines of PAutoClass, a parallel version of the AutoClass system based upon the Bayesian method for determining optimal classes in large datasets.…”
Section: Introduction (mentioning)
confidence: 99%
“…An example of parallel implementation of a clustering algorithm is P-CLUSTER [10]. Other parallel clustering algorithms are discussed in [5], [12], and [7]. In particular, in [7] an SPDM implementation of the AutoClass algorithm, named P-AutoClass is described.…”
Section: Parallel Cluster Analysis (mentioning)
confidence: 99%
“…Other parallel clustering algorithms are discussed in [5], [12], and [7]. In particular, in [7] an SPDM implementation of the AutoClass algorithm, named P-AutoClass is described. The paper shows interesting performance results on distributed memory MIMD machines.…”
Section: Parallel Cluster Analysis (mentioning)
confidence: 99%
“…Nevertheless, it is easy to see how distributed formulation of related hill-climbing algorithms such as k-means and k-median clustering [5,7,6] can be adapted to solve distributed FLP. We note, however, that all previous work on distributed clustering assumes tight cooperation and synchronization between the processors containing the data, and a central processor that collects the sufficient statistics needed in each step of the hill-climbing heuristic.…”
Section: Introduction (mentioning)
confidence: 99%
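
The last statement above refers to a central processor collecting the sufficient statistics needed at each step of a hill-climbing heuristic such as k-means. A minimal sketch of that pattern follows, in plain Python. It is not the P-AutoClass code and does not implement AutoClass's Bayesian classification; it only illustrates the partition-then-merge idea for one k-means step. The function names (`local_stats`, `merge_and_update`, `parallel_kmeans_step`) are hypothetical, and a sequential loop stands in for the separate processors.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
Stats = Tuple[List[List[float]], List[int]]  # (per-cluster sums, per-cluster counts)


def local_stats(partition: Sequence[Point], centroids: Sequence[Point]) -> Stats:
    """Work done by one processor: sufficient statistics for its own partition."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in partition:
        # Assign the point to its nearest centroid (squared Euclidean distance).
        j = min(
            range(k),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
        )
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts


def merge_and_update(stats: Sequence[Stats], centroids: Sequence[Point]) -> List[Point]:
    """Work done by the collecting processor: add up statistics, recompute centroids."""
    k, dim = len(centroids), len(centroids[0])
    total_sums = [[0.0] * dim for _ in range(k)]
    total_counts = [0] * k
    for sums, counts in stats:
        for j in range(k):
            total_counts[j] += counts[j]
            for d in range(dim):
                total_sums[j][d] += sums[j][d]
    new_centroids: List[Point] = []
    for j in range(k):
        if total_counts[j] == 0:
            # Assumption of this sketch: an empty cluster keeps its old centroid.
            new_centroids.append(tuple(centroids[j]))
        else:
            new_centroids.append(tuple(s / total_counts[j] for s in total_sums[j]))
    return new_centroids


def parallel_kmeans_step(
    partitions: Sequence[Sequence[Point]], centroids: Sequence[Point]
) -> List[Point]:
    """One k-means step; the loop stands in for processors working in parallel."""
    stats = [local_stats(part, centroids) for part in partitions]
    return merge_and_update(stats, centroids)
```

Because per-cluster sums and counts add across partitions, the merged update matches a single-machine k-means step on the pooled data, up to floating-point rounding. On a real multicomputer the merge would be a reduction over messages between processors rather than a list comprehension.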