This paper proposes an efficient spatiotemporal model that improves no-reference video quality assessment for sports videos containing dynamic motion scenes. The proposed method divides each video sequence into video blocks and applies a 3D shearlet transform, which efficiently extracts primary spatiotemporal features, to capture dynamic natural motion scene statistics from the incoming blocks. A deep residual bidirectional gated recurrent neural network combined with logistic regression then learns the spatiotemporal correlation more robustly and predicts the perceptual quality score. In addition, conditional video block-wise constraints are incorporated into the objective function to improve quality estimation for the entire video. Experimental results show that the proposed method extracts spatiotemporal motion information more effectively and predicts video quality with higher accuracy than conventional no-reference video quality assessment methods.
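As a rough illustration of the first step only, the sketch below shows one way a video sequence could be divided into spatiotemporal blocks before any transform is applied. It is a minimal sketch under stated assumptions, not the method's actual implementation: the function names `split_into_blocks` and `pool_block_scores`, the non-overlapping block layout, the block size, and the mean pooling of block scores are all hypothetical choices made for illustration. The 3D shearlet transform and the recurrent network are not reproduced here.

```python
import numpy as np


def split_into_blocks(video, block_shape):
    """Split a grayscale video of shape (T, H, W) into non-overlapping blocks.

    Assumption: blocks do not overlap, and trailing frames, rows, or columns
    that do not fill a whole block are dropped.
    Returns an array of shape (num_blocks, bt, bh, bw).
    """
    bt, bh, bw = block_shape
    T, H, W = video.shape
    nt, nh, nw = T // bt, H // bh, W // bw
    if min(nt, nh, nw) == 0:
        raise ValueError("block_shape is larger than the video")

    # Crop so every axis is an exact multiple of the block size.
    cropped = video[: nt * bt, : nh * bh, : nw * bw]

    # Expose the block grid as separate axes, then move the grid axes first.
    blocks = cropped.reshape(nt, bt, nh, bh, nw, bw)
    blocks = blocks.transpose(0, 2, 4, 1, 3, 5)
    return blocks.reshape(-1, bt, bh, bw)


def pool_block_scores(block_scores):
    """Combine per-block scores into one video-level score.

    Assumption: a plain mean is used purely as a placeholder.
    """
    return float(np.mean(block_scores))


# Usage: 16 frames of 64x64 pixels, cut into blocks of 8 frames x 32 x 32 pixels.
video = np.random.rand(16, 64, 64)
blocks = split_into_blocks(video, (8, 32, 32))
print(blocks.shape)  # (8, 8, 32, 32)
```

Each block returned here would then be passed to the spatiotemporal feature extractor; the block dimensions would need to match whatever the chosen transform expects.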