2018
DOI: 10.1145/3177851

A Comprehensive Perspective on Pilot-Job Systems

Abstract: Pilot-Job systems play an important role in supporting distributed scientific computing. They are used to consume more than 700 million CPU hours a year by the Open Science Grid communities, and by processing up to 1 million jobs a day for the ATLAS experiment on the Worldwide LHC Computing Grid. With the increasing importance of task-level parallelism in high-performance computing, Pilot-Job systems are also witnessing an adoption beyond traditional domains. Notwithstanding the growing impact on scientific re…

Cited by 69 publications (70 citation statements)
References 156 publications (145 reference statements)
“…Ref. [17] provided the architectural paradigm for pilot systems, however it is still unclear how an analogous paradigm would complement the work done on reference architectures for workflow systems [10], [8], and whether, given the very broad diversity of workflow systems and tools, we can even formulate a single architectural paradigm. This paradigm has been elusive so far, but it might be more fruitful to formulate system-level paradigms that have the properties of building blocks.…”
Section: Discussion (mentioning)
confidence: 99%
“…EnTK provides the ability to create and execute ensemble-based workflows/applications with diverse coordination and communication algorithms, abstracting the need for explicit resource management. EnTK uses RP as a pilot-based [28] runtime system to provide resource management and task execution capabilities. In turn, RP uses RS as an access layer towards HPC resources.…”
Section: A Radical-cybertools: Ensemble Execution On Summit (mentioning)
confidence: 99%
“…RADICAL-Pilot [25] is a Pilot system that implements the pilot paradigm as outlined in Ref. [39]. RADICAL-Pilot (RP) is implemented in Python and provides a well defined API and usage modes.…”
Section: Radical-pilot (mentioning)
confidence: 99%