This paper reports on the integration of parallel image processing into the ITK library and on improvements in user transparency over the state of the art. In our approach, image processing tasks are wrapped into objects that are passed to a parallel engine. The engine can exploit both data and task parallelism when executing the tasks on multicores, clusters, and/or GPUs. All features necessary for efficient parallel processing are specified by the task objects. The engine can infer most of these features itself and can check the correctness of those provided by the user. Inter-operation optimization is achieved by efficient scheduling of the tasks. The task dependency graph is created automatically at runtime, which is made possible by delaying the execution of the tasks and by relying on the intrinsic ITK pipeline updating mechanism. The low-level functions are also made available to the user, as is a library-independent version.
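
To make the idea of delayed execution concrete, the following C++ sketch shows one way a dependency graph can be assembled at runtime from task objects and then executed in dependency order. It is only an illustration under stated assumptions: `Task`, `Engine`, `Submit`, and `Run` are hypothetical names that belong neither to ITK nor to the engine described here, and the tasks run serially, whereas a real engine would dispatch independent tasks to multicores, clusters, or GPUs.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical task object: a name, the work to perform, and the handles
// of the tasks whose results it consumes.
struct Task {
  std::string name;
  std::function<void()> work;
  std::vector<std::size_t> deps;
};

// Hypothetical engine: Submit() only records tasks (delayed execution);
// Run() executes them in an order that respects the recorded dependencies.
class Engine {
public:
  // Record a task and return its handle. Nothing is executed here.
  std::size_t Submit(std::string name, std::function<void()> work,
                     std::vector<std::size_t> deps = {}) {
    // Dependencies must refer to already-submitted tasks, which also
    // guarantees that the graph is acyclic.
    for (std::size_t d : deps) {
      assert(d < tasks_.size() && "dependency must be submitted first");
    }
    tasks_.push_back({std::move(name), std::move(work), std::move(deps)});
    return tasks_.size() - 1;
  }

  // Execute all recorded tasks (Kahn's topological ordering) and return
  // the names in the order in which they were run.
  std::vector<std::string> Run() {
    const std::size_t n = tasks_.size();
    std::vector<std::size_t> pending(n, 0);              // unmet dependencies
    std::vector<std::vector<std::size_t>> consumers(n);  // reverse edges
    for (std::size_t i = 0; i < n; ++i) {
      pending[i] = tasks_[i].deps.size();
      for (std::size_t d : tasks_[i].deps) consumers[d].push_back(i);
    }

    std::vector<std::size_t> ready;
    for (std::size_t i = 0; i < n; ++i) {
      if (pending[i] == 0) ready.push_back(i);
    }

    std::vector<std::string> order;
    while (!ready.empty()) {
      // Every task in `ready` is independent of the others, so a parallel
      // engine could run them concurrently. This sketch runs them serially.
      const std::size_t t = ready.back();
      ready.pop_back();
      tasks_[t].work();
      order.push_back(tasks_[t].name);
      for (std::size_t c : consumers[t]) {
        if (--pending[c] == 0) ready.push_back(c);
      }
    }
    tasks_.clear();
    return order;
  }

private:
  std::vector<Task> tasks_;
};
```

In this sketch, deferring all work until `Run()` is what lets the engine see the whole graph before scheduling anything; in an ITK setting, the analogous trigger would plausibly be the pipeline update request, although that mapping is an assumption of this example.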