Phase unwrapping is the process of converting discontinuous phase data into a continuous image. This procedure is required by any imaging technology that uses phase data such as MRI, SAR or OQM microscopy. Such algorithms often take a significant amount of time to process on a general purpose computer, rendering it difficult to process large quantities of information. This paper compares implementations of a specific phase unwrapping algorithm known as Minimum L P norm unwrapping on a Field Programmable Gate Array (FPGA) and on a Graphics Processing Unit (GPU) for the purpose of acceleration. The computation required involves a matrix preconditioner (based on a DCT transform) and a conjugate gradient calculation along with a few other matrix operations. These functions are partitioned to run on the host or the accelerator depending on the capabilities of the accelerator. The tradeoffs between the two platforms are analyzed and compared to a General Purpose Processor (GPP) in terms of performance, power and cost.