Abstract. It is often difficult to execute a collection of jobs with complex dependencies efficiently, owing to the temporal unpredictability of the grid. One way to mitigate this unpredictability is to schedule job execution so that the number of jobs that can be sent to workers is maximized at every step. A recently developed scheduling theory provides a basis for meeting that optimization goal. Intuitively, when the number of such jobs is always large, high parallelism can be maintained even if the number of workers changes unpredictably over time. In this paper we present the design, implementation, and evaluation of a practical scheduling tool inspired by the theory. Given a DAGMan input file with interdependent jobs, the tool prioritizes the jobs. The resulting schedule significantly outperforms currently least 13% faster with 95% confidence. An implementation of the tool was integrated with the Condor high-throughput computing system.
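The optimization goal described in the abstract can be illustrated with a minimal sketch. It assumes a dag given as a map from each job to the set of its parents, and counts how many jobs are eligible (all parents done, not yet executed) after each step of a given execution order. The function name `eligibility_profile` and the sample dag are hypothetical and are not taken from the tool or from DAGMan.

```python
from typing import Dict, List, Set


def eligibility_profile(parents: Dict[str, Set[str]], order: List[str]) -> List[int]:
    """Count eligible jobs after each step of `order`.

    A job is eligible when it has not been executed yet and all of its
    parents have been executed. `parents` maps every job to its parent set.
    """
    done: Set[str] = set()
    profile: List[int] = []
    for job in order:
        # The order must respect the dependencies of the dag.
        if not parents[job] <= done:
            raise ValueError(f"job {job!r} executed before its parents")
        done.add(job)
        eligible = [j for j, ps in parents.items() if j not in done and ps <= done]
        profile.append(len(eligible))
    return profile


# Hypothetical dag: a -> c, a -> d, b -> e
example_parents = {"a": set(), "b": set(), "c": {"a"}, "d": {"a"}, "e": {"b"}}

profile_a_first = eligibility_profile(example_parents, ["a", "b", "c", "d", "e"])
profile_b_first = eligibility_profile(example_parents, ["b", "a", "c", "d", "e"])
```

In this example, running `a` first keeps at least as many jobs eligible at every step as running `b` first, which is the kind of step-by-step comparison the scheduling goal refers to.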
Earlier work has developed the underpinnings of IC-Scheduling Theory, a framework for scheduling computations having intertask dependencies (modeled via dags) for Internet-based computing. The goal of the schedules produced is to render tasks eligible for execution at the maximum possible rate, with the dual aim of: (a) utilizing remote clients' computational resources well, by always having work to allocate to an available client; (b) lessening the likelihood of a computation's stalling for lack of eligible tasks. The dags handled by the Theory thus far are those that can be decomposed into a given collection of bipartite building-block dags via the operation of dag-decomposition. A basic tool in constructing schedules is a relation ⊲, which allows one to "prioritize" the scheduling of a complex dag's building blocks. The current paper extends IC-Scheduling Theory in two ways: by expanding significantly the repertoire of dags that the Theory can schedule optimally, and by allowing one sometimes to shortcut the algorithmic process required to find optimal schedules. The expanded repertoire now allows the Theory to schedule optimally, among other dags, a large range of dags that are either "expansive," in the sense that they grow outward from their sources, or "reductive," in the sense that they grow inward toward their sinks. The algorithmic shortcuts allow one to "read off" an optimal schedule for a dag from a given optimal schedule for the dag's dual, which is obtained by reversing all arcs (thereby exchanging the roles of sources and sinks).
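The dual construction mentioned above can be sketched as follows, assuming a dag represented as a list of (parent, child) arcs. The helper names `dual`, `sources`, and `sinks` and the sample arcs are hypothetical. The sketch shows only the arc reversal and the resulting exchange of sources and sinks; it does not implement the procedure for reading off an optimal schedule.

```python
from typing import List, Set, Tuple

Arc = Tuple[str, str]  # (parent, child)


def dual(arcs: List[Arc]) -> List[Arc]:
    """Reverse every arc of the dag."""
    return [(child, parent) for (parent, child) in arcs]


def sources(arcs: List[Arc]) -> Set[str]:
    """Nodes with no incoming arc."""
    nodes = {n for arc in arcs for n in arc}
    return nodes - {child for (_, child) in arcs}


def sinks(arcs: List[Arc]) -> Set[str]:
    """Nodes with no outgoing arc."""
    nodes = {n for arc in arcs for n in arc}
    return nodes - {parent for (parent, _) in arcs}


# Hypothetical dag: s -> x, s -> y, x -> t
example_arcs = [("s", "x"), ("s", "y"), ("x", "t")]
dual_arcs = dual(example_arcs)
```

Here the original dag has source `s` and sinks `y` and `t`; in the dual, `y` and `t` become the sources and `s` becomes the only sink.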