In heterogeneous systems, load balancing policies accelerate tasks by distributing work among devices, improving both performance and energy efficiency. However, programmability remains a key challenge: specifically, releasing the programmer from the burden of managing data and devices with different architectures. To this end, we extend EngineCL, a high-level framework built on top of OpenCL, to support FPGA devices. Our proposal fully integrates FPGAs into the framework, enabling effective cooperation between CPU, GPU, and FPGA devices. With command overlapping and judicious data management, our work improves performance by up to 96% compared with single-device execution and delivers energy-delay gains of up to 36%. Moreover, adopting FPGAs does not require programmers to make significant changes to their applications, because the extensions do not modify the user-facing interface of EngineCL.