Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers 2009
DOI: 10.1145/1646468.1646472

A data throughput prediction and optimization service for widely distributed many-task computing

Abstract: In this paper, we present the design and implementation of a network throughput prediction and optimization service for many-task computing in widely distributed environments. This service uses multiple parallel TCP streams to improve the end-to-end throughput of data transfers. A novel mathematical model is used to decide the number of parallel streams to achieve best performance. This model can predict the optimal number of parallel streams with as few as three prediction points. We implement this new servic…
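The abstract does not state the model itself, so the following is only a minimal sketch of how a three-point prediction could work. It assumes a throughput curve of the form Th(n) = n / sqrt(a*n^2 + b*n + c), which is an assumption made here for illustration and not a form confirmed by this page. The function names, the sample values, and the `max_streams` cap are all hypothetical.

```python
import math

def fit_three_points(samples):
    """Fit (a, b, c) from exactly three (streams, throughput) samples.

    Assumed model (not confirmed by the source):
        Th(n) = n / sqrt(a*n^2 + b*n + c)
    Squaring gives n^2 / Th^2 = a*n^2 + b*n + c, which is linear in a, b, c.
    """
    if len(samples) != 3:
        raise ValueError("exactly three samples are required")
    # Augmented 3x4 matrix: [n^2, n, 1 | (n / Th)^2]
    rows = [[float(n * n), float(n), 1.0, (n / th) ** 2] for n, th in samples]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        if abs(rows[col][col]) < 1e-12:
            raise ValueError("sample stream counts must be distinct")
        for r in range(3):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    a, b, c = (rows[i][3] / rows[i][i] for i in range(3))
    return a, b, c

def predicted_throughput(n, a, b, c):
    """Evaluate the assumed model; returns None where it is undefined."""
    radicand = a * n * n + b * n + c
    return n / math.sqrt(radicand) if radicand > 0 else None

def optimal_streams(a, b, c, max_streams=64):
    """Pick the integer stream count with the highest predicted throughput."""
    best_n, best_th = 1, float("-inf")
    for n in range(1, max_streams + 1):
        th = predicted_throughput(n, a, b, c)
        if th is not None and th > best_th:
            best_n, best_th = n, th
    return best_n

# Hypothetical usage: three probe transfers at 1, 2 and 4 streams.
# The throughput values are synthetic, generated from the assumed model.
true_a, true_b, true_c = 0.004, -0.01, 0.1
probes = [(n, predicted_throughput(n, true_a, true_b, true_c)) for n in (1, 2, 4)]
a, b, c = fit_three_points(probes)
best = optimal_streams(a, b, c)
```

With real measurements the three probes would be noisy, so the fitted curve, and therefore the suggested stream count, depends on which three points are sampled.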

Cited by 18 publications (18 citation statements). References 20 publications.
“…the performance degradation due to the overhead of opening too many parallel streams). In our previous work, we have developed two highly-accurate models based on Full Second-order [17] and Partial C-order [16]. These models would require as few as three sampling points in the best case to provide very accurate predictions, but in the worst case they could require up to six or seven sampling points for accurate results.…”
Section: Related Work (mentioning)
confidence: 99%
“…These models lay the foundations of our current work for a highly-accurate and low-overhead prediction model for transfer throughput optimization. The existing prediction models (Partial Second-order [23], Full Second-order [17] and Partial C-order [16]) worked with as few as two or three sampling points, but choosing the best two or three sampling points was a major challenge in those models. If we randomly choose these two or three data points, the resulting approximation may be highly inaccurate (see Figure 1), since there is a high possibility that these random points may not reflect the characteristics of the actual throughput curve.…”
Section: Related Work (mentioning)
confidence: 99%
“…The systems probe and measurements with external profilers are needed. Complex models are used to calculate the optimum number of multiple streams with the help of sample measurements in order to make a prediction [23,25,26]. Further, network conditions may change over time in the shared environments, and the estimated value might not reflect the most recent state of the system.…”
Section: Application-level Dynamic Tuning (mentioning)
confidence: 99%
“…The proposed methodology operates without depending on any historical measurements and does not use external profiles for measurement. Instead of using predictive sampling as proposed in [17,25,26], we make use of the instant throughput information gathered from the actual data transfer operations that are currently active. The number of multiple streams is set dynamically in an adaptive manner by gradually increasing the number of concurrent connections up to an optimal point.…”
Section: Application-level Dynamic Tuning (mentioning)
confidence: 99%
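
The adaptive approach described in the last statement, raising the number of concurrent connections gradually until throughput stops improving, can be illustrated with a simple hill-climbing loop. This is a sketch under stated assumptions, not the citing authors' algorithm: the `measure` callback, the `min_gain` threshold, and the `max_streams` cap are hypothetical names introduced here.

```python
def tune_streams(measure, max_streams=32, min_gain=0.05):
    """Add one stream at a time while throughput keeps improving.

    measure(n) is a hypothetical callback that returns the throughput
    observed on the active transfer when n streams are in use.
    Stops when one more stream yields less than min_gain relative improvement.
    """
    streams = 1
    best = measure(streams)
    while streams < max_streams:
        candidate = measure(streams + 1)
        if candidate < best * (1.0 + min_gain):
            break  # no meaningful gain: keep the current stream count
        streams += 1
        best = candidate
    return streams, best

def synthetic_link(n):
    """Synthetic throughput curve for illustration only: rises until
    6 streams, then degrades as extra streams add overhead."""
    return 100.0 * n if n <= 6 else 600.0 - 20.0 * (n - 6)

chosen, throughput = tune_streams(synthetic_link)
```

Unlike the sampling-based models, this loop needs no separate probe transfers, but it reacts only to the most recent measurement and can stop early on a noisy reading.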