The bounding volume hierarchy (BVH) has been widely adopted as the acceleration structure for broad-phase collision detection. Previous state-of-the-art BVH-based collision detection approaches exploit the spatio-temporal coherence of simulations by maintaining a bounding volume test tree (BVTT) front. A major drawback of these algorithms is that large deformations in the scene decrease culling efficiency and slow down collision queries. Moreover, for front-based methods, inefficient caching on the GPU, caused by the arbitrary layout of BVH and BVTT front nodes, becomes a critical performance issue. We present a fast and robust BVH-based collision detection scheme on the GPU that addresses these problems by ordering and restructuring BVHs and BVTT fronts. Our techniques are based on histogram sort and an auxiliary structure, the BVTT front log, through which we analyze the dynamic status of the BVTT front and the quality of the BVH. Our approach efficiently handles inter- and intra-object collisions and performs especially well in simulations with considerable spatio-temporal coherence. Benchmark results demonstrate that our approach is significantly faster than the previous BVH-based method and also outperforms state-of-the-art spatial subdivision schemes in speed.
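To make the notion of a BVTT front concrete, the following is a minimal CPU-side sketch in Python, not the GPU scheme summarized above. It assumes 2D axis-aligned boxes, a median-split BVH, and a front that only sprouts downward (it is never pruned back up). The histogram sort, the BVTT front log, and the restructuring steps are not modeled, and all names (`Node`, `build`, `refit`, `sprout`, `query_from_front`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Node:
    lo: Tuple[float, float]
    hi: Tuple[float, float]
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prim: Optional[int] = None  # primitive index, set only on leaves

    @property
    def is_leaf(self):
        return self.prim is not None

def build(boxes, ids=None):
    """Median-split BVH over boxes given as ((x0, y0), (x1, y1))."""
    ids = list(range(len(boxes))) if ids is None else ids
    lo = tuple(min(boxes[i][0][k] for i in ids) for k in range(2))
    hi = tuple(max(boxes[i][1][k] for i in ids) for k in range(2))
    if len(ids) == 1:
        return Node(lo, hi, prim=ids[0])
    axis = 0 if hi[0] - lo[0] >= hi[1] - lo[1] else 1
    ids = sorted(ids, key=lambda i: boxes[i][0][axis] + boxes[i][1][axis])
    mid = len(ids) // 2
    return Node(lo, hi, build(boxes, ids[:mid]), build(boxes, ids[mid:]))

def refit(node, boxes):
    """Update bounds bottom-up after primitives move; topology is unchanged."""
    if node.is_leaf:
        node.lo, node.hi = boxes[node.prim]
        return
    refit(node.left, boxes)
    refit(node.right, boxes)
    node.lo = tuple(min(node.left.lo[k], node.right.lo[k]) for k in range(2))
    node.hi = tuple(max(node.left.hi[k], node.right.hi[k]) for k in range(2))

def overlap(a, b):
    return all(a.lo[k] <= b.hi[k] and b.lo[k] <= a.hi[k] for k in range(2))

def sprout(a, b, front, hits):
    """Descend the BVTT from pair (a, b); pairs where descent stops form the front."""
    if not overlap(a, b):
        front.append((a, b))           # culled pair stays in the front
    elif a.is_leaf and b.is_leaf:
        front.append((a, b))           # leaf-leaf pair: candidate collision
        hits.append((a.prim, b.prim))
    elif not a.is_leaf:
        sprout(a.left, b, front, hits)
        sprout(a.right, b, front, hits)
    else:
        sprout(a, b.left, front, hits)
        sprout(a, b.right, front, hits)

def query_from_front(front):
    """Resume traversal from the previous frame's front instead of the root pair."""
    new_front, hits = [], []
    for a, b in front:
        sprout(a, b, new_front, hits)
    return new_front, hits

# Two rows of unit boxes: object A at y in [0, 1], object B initially far above.
boxes_a = [((2.0 * i, 0.0), (2.0 * i + 1.0, 1.0)) for i in range(4)]
boxes_b = [((2.0 * i, 5.0), (2.0 * i + 1.0, 6.0)) for i in range(4)]
root_a, root_b = build(boxes_a), build(boxes_b)

# Frame 0: start from the root pair; the objects are disjoint.
front0, hits0 = query_from_front([(root_a, root_b)])

# Frame 1: B moves down by 4.5, is refitted, and the query resumes from front0.
boxes_b = [((x0, y0 - 4.5), (x1, y1 - 4.5)) for (x0, y0), (x1, y1) in boxes_b]
refit(root_b, boxes_b)
front1, hits1 = query_from_front(front0)
```

In frame 0 the root boxes do not overlap, so the front is the single root pair. In frame 1 the traversal resumes from that stored pair and descends only where boxes now overlap. Because this sketch never moves the front back up when pairs separate, a long-running simulation would accumulate unnecessarily deep front nodes.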