One of the main challenges of using cutting-edge medical imaging applications in the clinical setting is the large amount of data processing they require. Many of these applications are based on linear algebra computations operating on large data sets, and their execution may take days on a standard CPU. Distributed heterogeneous systems can improve application performance by using the right computation-to-hardware mapping: to achieve high performance, each computation is assigned to the hardware platform whose architectural features, such as clock speed, number of parallel computational units, and memory bandwidth, best satisfy its needs. In this paper, we evaluate the performance benefits of using different hardware platforms to accelerate the execution of a transmural electrophysiological imaging algorithm, targeting a standard CPU with GPU and FPGA accelerators. Using this application as a case study, we demonstrate the importance of making intelligent computation assignments for improved performance. We show that, depending on the size of the data structures the application works with, using an FPGA to run certain computations can make a substantial difference: a heterogeneous system with all three hardware platforms (CPU+GPU+FPGA) can cut the execution time in half compared to the best result using a single accelerator (CPU+GPU). In addition, our experimental results show that combining CPU, GPU, and FPGA platforms in a single system achieves speedups of up to 62x, 2x, and 1605x compared to systems with a single CPU, GPU, or FPGA platform, respectively.