Real-time rendering of three-dimensional scenes with photorealistic detail is a hard task, particularly with rendering algorithms such as ray tracing. In general, the performance achieved by a sequential software implementation of ray tracing is far from satisfactory. However, because the algorithm is embarrassingly parallel, parallel implementations have achieved reasonable real-time performance, and a custom parallel hardware design is likely to achieve even higher performance. In this paper, we propose a parallel hardware architecture that supports the main desirable features of ray tracing, such as shadows and reflections, at a low area cost and with promising rendering performance. The architecture, called GridRT, is based on the Uniform Grid acceleration structure and is intended to deliver massive parallelism through parallel ray-triangle intersection tests as well as parallel processing of many rays. We present a hardware implementation of the proposed architecture, together with performance results and resource requirements. With a grid configuration of eight processing elements, rendering time is reduced by 80%.
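To make the ray-triangle intersection test concrete, the following is a minimal software sketch of one common formulation, the Möller-Trumbore test. It is an illustrative assumption and is not claimed to be the test used in GridRT; the names `intersect` and `nearest_hit` are hypothetical. The loop in `nearest_hit` visits triangles one at a time, whereas a parallel design would evaluate several such tests concurrently.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[Vec3, Vec3, Vec3]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def intersect(origin: Vec3, direction: Vec3, tri: Triangle,
              eps: float = 1e-9) -> Optional[float]:
    """Moller-Trumbore test: return the ray parameter t of the hit, or None."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:            # ray is parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv_det       # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv_det   # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv_det
    return t if t > eps else None     # accept hits in front of the origin only


def nearest_hit(origin: Vec3, direction: Vec3,
                triangles: Sequence[Triangle]) -> Optional[Tuple[int, float]]:
    """Sequential stand-in for testing all triangles of one grid cell."""
    best: Optional[Tuple[int, float]] = None
    for index, tri in enumerate(triangles):
        t = intersect(origin, direction, tri)
        if t is not None and (best is None or t < best[1]):
            best = (index, t)
    return best
```

Because each call to `intersect` depends only on the ray and one triangle, the tests are independent of one another, which is the property that makes this step a natural candidate for parallel evaluation.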