We present a flexible and highly efficient hardware-assisted volume renderer based on the original Projected Tetrahedra (PT) algorithm. Unlike recent similar approaches, our method relies exclusively on the rasterization of simple geometric primitives and takes full advantage of graphics hardware. Vertex and geometry shaders compute the tetrahedral projection, while the volume ray integral is evaluated in a fragment shader; hence, volume rendering is performed entirely on the GPU within a single pass through the pipeline. We apply a CUDA-based visibility ordering, achieving rendering and sorting performance of over 6 million tetrahedra per second (6 M Tet/s) for unstructured datasets. Furthermore, because each tetrahedron is processed independently, we employ a data-parallel solution that is neither bound by GPU memory size nor reliant on auxiliary volume information. In addition, iso-surfaces can be readily extracted during the rendering process, and time-varying data are handled without additional overhead.
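
As an illustration of the visibility-ordering step, the following minimal sketch sorts tetrahedra back to front by the squared distance from the eye to each centroid. It is a CPU-side Python sketch under stated assumptions, not the CUDA implementation described above: the centroid-based ordering criterion is an assumption (the abstract does not specify it), and the names `centroid_depth` and `sort_back_to_front` are hypothetical. Centroid ordering is only approximate for general unstructured meshes.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Tet = Tuple[int, int, int, int]  # four indices into the vertex list


def centroid_depth(vertices: Sequence[Vec3], tet: Tet, eye: Vec3) -> float:
    """Squared distance from the eye to the centroid of one tetrahedron."""
    # Centroid is the average of the four vertex positions, per axis.
    centroid = [sum(vertices[i][axis] for i in tet) / 4.0 for axis in range(3)]
    return sum((centroid[axis] - eye[axis]) ** 2 for axis in range(3))


def sort_back_to_front(
    vertices: Sequence[Vec3], tets: Sequence[Tet], eye: Vec3
) -> List[int]:
    """Return tetrahedron indices ordered from farthest to nearest.

    Assumption: centroid distance is used as the sort key. Each key depends
    on a single tetrahedron only, so key computation is data-parallel.
    """
    return sorted(
        range(len(tets)),
        key=lambda t: centroid_depth(vertices, tets[t], eye),
        reverse=True,
    )
```

Because every sort key is computed from one tetrahedron alone, the key computation maps naturally onto one GPU thread per tetrahedron, followed by a parallel key-value sort.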