Thinning is an important task in many image processing applications, including remote sensing, photogrammetry, optical character recognition, and medical imaging. In this study, we compare the performance of thinning algorithms on parallel hardware. Grayscale thinning involves a substantial amount of computation per pixel and can be accelerated in several ways: algorithmic improvements, code optimization, and parallelization. We describe an algorithmic improvement that speeds up grayscale thinning several-fold, and demonstrate scalable acceleration using multi-core CPU concurrency libraries (such as OpenMP), coprocessor hardware (such as the Xeon Phi), and GPUs (such as CUDA-enabled NVIDIA graphics cards). GPU processing appears to offer the most cost-effective approach for high-performance grayscale thinning applications.
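
To illustrate the CPU-side parallelization strategy, the sketch below parallelizes a per-pixel neighborhood pass with OpenMP. The kernel shown is a placeholder 3x3 grayscale erosion (minimum filter), not the thinning operator studied here; the function name, buffer layout, and dimensions are illustrative assumptions. Because each output row depends only on the read-only source image, the outer loop distributes cleanly across cores.

    /* Placeholder per-pixel pass, parallelized with OpenMP.
       NOT the paper's thinning operator: a 3x3 minimum filter stands in
       for the per-pixel neighborhood computation. */
    #include <omp.h>

    void erode_pass(const unsigned char *src, unsigned char *dst,
                    int width, int height)
    {
        /* Rows are independent (src is read-only), so the outer loop
           can be split statically across the available cores. */
        #pragma omp parallel for schedule(static)
        for (int y = 1; y < height - 1; y++) {
            for (int x = 1; x < width - 1; x++) {
                unsigned char m = 255;
                /* Minimum over the 3x3 neighborhood of (x, y). */
                for (int dy = -1; dy <= 1; dy++)
                    for (int dx = -1; dx <= 1; dx++) {
                        unsigned char v = src[(y + dy) * width + (x + dx)];
                        if (v < m)
                            m = v;
                    }
                dst[y * width + x] = m;
            }
        }
    }

The same loop structure maps naturally onto the other targets discussed: on a Xeon Phi the identical OpenMP pragma runs across many more cores, and on a GPU the two spatial loops become the CUDA thread grid, with one thread per output pixel.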