Genetic Programming (GP) is a computationally intensive technique that also has a high degree of natural parallelism. Parallel computing architectures have become commonplace, especially Graphics Processing Units (GPUs). Hence, versions of GP have been implemented that utilise these highly parallel computing platforms, enabling significant gains in the computational speed of GP. Recently, however, a two-dimensional stack approach to GP using a multicore CPU also demonstrated considerable performance gains, equivalent to or exceeding those achieved by a GPU. This paper demonstrates that a similar two-dimensional stack approach can also be applied to GPU-based GP to better exploit the underlying technology. Performance gains are achieved over a standard single-dimensional stack approach when utilising a GPU. Overall, a peak computational speed of over 55 billion Genetic Programming Operations per Second is observed, a twofold improvement over the best GPU-based single-dimensional stack approach from the literature.