This paper describes COLT HPF, a run-time support layer specifically designed for the co-ordination of concurrent and communicating HPF tasks. COLT HPF is implemented on top of MPI and requires only small changes to the run-time support of the HPF compiler used. Although programmers can use the COLT HPF API directly to write applications as a flat collection of interacting data-parallel tasks, we believe that it is used more productively through a compiler for a simple high-level co-ordination language that helps programmers structure a set of data-parallel HPF tasks according to common forms of task parallelism. The paper outlines design and implementation issues and discusses the main differences from other approaches to exploiting task parallelism in the HPF framework. We show how COLT HPF can be used to implement common forms of parallelism, e.g. pipelines and processor farms, and we present experimental results for both synthetic micro-benchmarks and sample applications. The experiments were conducted on an SGI/Cray T3E using Adaptor, a public domain HPF compiler.