Please cite this article as: D. Castro, et al., Automatically deriving cost models for structured parallel processes using hylomorphisms, Future Generation Computer Systems (2017), http://dx.doi.org/10.1016/j.future. 2017.04.035 This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. An operational semantics of a queue language for streaming computations. An operational semantics of algorithmic skeletons using this queue language. Derive cost equations for algorithmic skeletons from the operational semantics. Cost equations for algorithmic skeletons are combined with sized types.
AbstractStructured parallelism using nested algorithmic skeletons can greatly ease the task of writing parallel software, since common, but hard-to-debug, problems such as race conditions are eliminated by design. However, choosing the best combination of algorithmic skeletons to yield good parallel speedups for a specific program on a specific parallel architecture is still a difficult problem. This paper uses the unifying notion of hylomorphisms, a general recursion pattern, to make it possible to reason about both the functional correctness properties and the extra-functional timing properties of structured parallel programs. We have previously used hylomorphisms to provide a denotational semantics for skeletons, and proved that a given parallel structure for a program satisfies functional correctness. This paper expands on this theme, providing a simple operational semantics for algorithmic skeletons and a cost semantics that can be automatically derived from that operational semantics. We prove that both semantics are sound with respect to our previously defined denotational semantics. This means that we can now automatically and statically choose a provably optimal parallel structure for a given program with respect to a cost model for a (class of) parallel architecture. By deriving an automatic amortised analysis from our cost model, we can also accurately predict parallel runtimes and speedups.