2010 IEEE International Conference on Granular Computing
DOI: 10.1109/grc.2010.54

Parallel Simultaneous Co-clustering and Learning with Map-Reduce

Abstract: Many data mining applications involve predictive modeling of very large, complex datasets. Such applications present a need for innovative algorithms and associated implementations that are not only effective in terms of prediction accuracy, but can also be efficiently run on distributed computational systems to yield results in reasonable time. This paper focuses on predictive modeling of multirelational data such as dyadic data with associated covariates or "side-information". We first give illustrative exam…

Cited by 11 publications (7 citation statements) | References 16 publications

Citation statements, ordered by relevance:
“…The complexities of data distribution, parallel computation, and resource scheduling are managed by the MapReduce framework [3]. In contrast to many previous uses of MapReduce to scale up machine learning that require multiple passes over the data [4]-[9], this approach requires only a single pass (single MapReduce step) to construct the entire ensemble. This minimizes disk I/O and the overhead of setting up and shutting down MapReduce jobs.…”
Section: Introduction
confidence: 99%
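The single-pass construction this quote describes is easy to picture in code. The sketch below is illustrative only, not the cited implementation: each map task trains one base learner on its local data shard, and a lone reduce step collects the fitted learners into an ensemble, so the training data is read exactly once. The map/reduce phases are simulated with plain Python functions, and the function names, the use of scikit-learn decision trees, and the averaging combiner are all assumptions made for the example; a real deployment would run these as Hadoop MapReduce tasks.

```python
# Minimal sketch of single-pass ensemble construction in MapReduce style.
# Hypothetical names; not the authors' code.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def map_task(partition):
    """Map phase: fit one base learner on the local data shard only."""
    X, y = partition
    model = DecisionTreeRegressor(max_depth=5).fit(X, y)
    # Emit every model under one key so a single reducer collects them all.
    return ("ensemble", model)

def reduce_task(key, models):
    """Reduce phase: the ensemble is simply the bag of fitted base models."""
    return list(models)

def ensemble_predict(models, X):
    """Combine the base models by averaging their predictions."""
    return np.mean([m.predict(X) for m in models], axis=0)

# Simulated driver: shard the data, run the maps, run the single reduce.
rng = np.random.default_rng(0)
X = rng.normal(size=(3000, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=3000)
shards = [(X[i::3], y[i::3]) for i in range(3)]    # three "map" inputs

mapped = [map_task(shard) for shard in shards]     # one pass over the data
ensemble = reduce_task("ensemble", (model for _, model in mapped))
print(ensemble_predict(ensemble, X[:5]))
```

Because each shard is visited by exactly one mapper and the reducer only aggregates already-fitted models, the entire ensemble is built in a single MapReduce step, which is the property the quoted passage highlights.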
“…A prediction model is fit to each co-cluster. Deodhar et al. [20] also present a parallel version of the SCOAL algorithm. Our paper is inspired by these works but additionally addresses fuzziness.…”
Section: Parallel Machine-learning Algorithms
confidence: 99%
“…In addition, we discuss parallelization of a pre-clustering technique called Canopy clustering [17]. We then cover MapReduce algorithms for hierarchical clustering [22], density-based clustering [11] and co-clustering [9,21].…”
Section: Data Mining
confidence: 99%