2007 8th IEEE/ACM International Conference on Grid Computing 2007
DOI: 10.1109/grid.2007.4354142

Data placement for scientific applications in distributed environments

Cited by 86 publications
(49 citation statements)
References 20 publications
“…Much research interest is concentrated on workflow performance improvement and such strategies as task clustering [5], task/data throttling [13] and data staging [1] are the mostly used techniques to address the issues of performance optimization. In order to improve performance of fine computational granularity task scientific workflows, task clustering [5] can minimize the completion time of the workflow by reducing the impact of queue waiting time.…”
Section: Related Work (mentioning)
confidence: 99%
“…A different approach is followed by [11]. Its authors, like those of [4], argue that in order to improve efficiency and also simplify the systems, it is best to separate the data placement and the scheduling machinery.…”
Section: Related Work (mentioning)
confidence: 99%
“…It is out of the scope of the present work to develop a component that completely solves these challenges but, in the line of [11], we believe that it would be best to decouple job scheduling from data replication tasks. These are best handled by existing data placement systems.…”
Section: Proposal For An Enhanced Solution (mentioning)
confidence: 99%
“…They range from storage glide-ins (e.g., BADFS [10]) to building application-optimized storage systems (e.g. HDFS [11], BADFS [10]), to building a configurable storage system that is tuned at deployment time to better support a specific application [12], to offering specific data access optimizations (e.g., location-aware scheduling [13], caching, and data placement techniques [14]). Taken in isolation, these efforts do not fully address the problem we face as they are either specific to a class of applications (e.g., HDFS for map-reduce applications), and consequently incapable to support a large set of workflow applications; or enable system-wide optimizations throughout the application runtime, thus inefficiently supporting applications that have different usage patterns for different files.…”
Section: Introduction (mentioning)
confidence: 99%