2007 IEEE International Parallel and Distributed Processing Symposium
DOI: 10.1109/ipdps.2007.370250

Towards A Better Understanding of Workload Dynamics on Data-Intensive Clusters and Grids

Abstract: This paper presents a comprehensive statistical analysis of workloads collected on data-intensive clusters and Grids. The analysis is conducted at different levels, including Virtual Organization (VO) and user behavior. The aggregation procedure and scaling analysis are applied to job arrival processes, leading to the identification of several basic patterns, namely, pseudo-periodicity, long range dependence (LRD), and (multi)fractals. It is shown that statistical measures based on interarrivals are of limited…
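
The aggregation and scaling analysis mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes the variance-time method, one common way to estimate the Hurst parameter H of a count process, in which the variance of the block-averaged series scales roughly as m^(2H-2) for block size m. All function names and parameter values below are hypothetical, and the input is synthetic Poisson arrivals rather than a real Grid trace.

```python
import math
import random


def poisson_arrivals(n, rate, seed=0):
    """Synthetic arrival times (seconds) with exponential interarrivals.
    Assumption: stands in for a real job-arrival trace."""
    rng = random.Random(seed)
    t, times = 0.0, []
    for _ in range(n):
        t += rng.expovariate(rate)
        times.append(t)
    return times


def aggregate_counts(arrival_times, bin_width):
    """Turn arrival timestamps into a count process: arrivals per bin."""
    if not arrival_times:
        return []
    start = min(arrival_times)
    n_bins = int((max(arrival_times) - start) // bin_width) + 1
    counts = [0] * n_bins
    for t in arrival_times:
        counts[int((t - start) // bin_width)] += 1
    return counts


def block_means(series, m):
    """Average the series over non-overlapping blocks of size m."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def variance(xs):
    """Population variance of a list of numbers."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def hurst_variance_time(counts, block_sizes):
    """Estimate H from the least-squares slope of log Var(X^(m)) against
    log m, using slope = 2H - 2."""
    xs, ys = [], []
    for m in block_sizes:
        agg = block_means(counts, m)
        if len(agg) < 2:
            continue
        v = variance(agg)
        if v > 0:
            xs.append(math.log(m))
            ys.append(math.log(v))
    if len(xs) < 2:
        raise ValueError("need at least two usable block sizes")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return 1.0 + slope / 2.0


if __name__ == "__main__":
    counts = aggregate_counts(poisson_arrivals(20000, rate=1.0), bin_width=10.0)
    print("estimated H:", hurst_variance_time(counts, [1, 2, 4, 8, 16, 32]))
```

For short-range-dependent input such as the Poisson arrivals here, the estimate should fall near 0.5; values noticeably above 0.5 on a real trace would be consistent with long range dependence, although the variance-time estimator is only a rough diagnostic.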

Cited by 14 publications (11 citation statements)
References 24 publications

“…However, it is not as flexible as models in case that many traces have to be generated to enable a Grid scheduling study. The traces available from parallel workloads can also have significantly different characteristics compared to Grid workloads, which has been empirically proved [12]. Such differences, in turn, may lead to considerably different performance evaluation results.…”
Section: Evaluation Of Scheduling Algorithms (mentioning)
confidence: 98%
“…These traces or models, however, exhibit significantly different characteristics than the traces on production Grids. As has been studied and reported in [12], pseudo-periodicity, long range dependence (LRD), and "bag-of-tasks" behavior with strong temporal locality are the main properties that characterize production Grid workloads. Therefore, it is important that representative models be developed to capture the salient properties of Grid workloads.…”
Section: Evaluation Of Scheduling Algorithms (mentioning)
confidence: 99%
“…To overcome this difficulty, many researchers decide to use randomly generated workloads in their work. These workloads are usually unrealistic because several statistical studies [7], [8], [13], [14], [15], [31] have shown that the characteristics of parallel system workloads are far from independently and identically distributed. Instead, they have several important and correlated characteristics such as long range dependence, burstiness, and bag-of-tasks (BoT) behaviour.…”
Section: Introduction (mentioning)
confidence: 99%