This paper focuses on deep vision sensing-assisted gesture recognition for athletes in dynamic scenes. Although much research attention has been devoted to this field in recent years, most existing works fail to take the characteristics of dynamic scenes fully into consideration. To address this challenge, this paper proposes a diffusion convolution neural network-based multi-view gesture recognition approach for dynamic scenes. First, dynamic spatiotemporal slice position selection based on the body mask heatmap is adopted to calculate the positions of the horizontal and vertical slices. Slice positions can thus be selected dynamically in both directions, which allows bi-directional spatiotemporal slice images to be extracted. Second, action sequences are learned through a 3D residual neural network, and the spatiotemporal information among frames is mined through recurrent networks. By combining the two components, a multi-view gesture recognition approach for athletes is constructed. In the experiments, two standard datasets, UCF101 and HMDB51, are used to build the experimental environment. The proposed method reaches an accuracy above 95% on both datasets and achieves higher accuracy than several typical recognition methods.
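To make the slice-selection step concrete, the following is a minimal sketch in Python with NumPy. It is not the implementation used in this paper: the selection rule (choosing the row and column with the largest accumulated heatmap mass), the function names `select_slice_positions` and `extract_slices`, and the toy data are all assumptions introduced for illustration only.

```python
import numpy as np


def select_slice_positions(heatmap):
    """Pick one horizontal and one vertical slice position.

    heatmap: (H, W) non-negative body mask heatmap.
    Assumed rule (not taken from the paper): choose the row and the
    column that carry the largest total heatmap mass.
    """
    row_mass = heatmap.sum(axis=1)  # total mass per row, shape (H,)
    col_mass = heatmap.sum(axis=0)  # total mass per column, shape (W,)
    return int(np.argmax(row_mass)), int(np.argmax(col_mass))


def extract_slices(video, row, col):
    """Cut bi-directional spatiotemporal slice images from a clip.

    video: (T, H, W) grayscale clip.
    Returns a horizontal slice of shape (T, W) and a vertical slice
    of shape (T, H).
    """
    horizontal = video[:, row, :]  # fixed row tracked over time
    vertical = video[:, :, col]    # fixed column tracked over time
    return horizontal, vertical


# Toy clip and hypothetical heatmap, for illustration only.
T, H, W = 8, 6, 10
video = np.arange(T * H * W, dtype=np.float32).reshape(T, H, W)
heatmap = np.zeros((H, W))
heatmap[2:5, 3:6] = 1.0  # coarse body region
heatmap[3, 4] = 5.0      # strongest response inside the body region

row, col = select_slice_positions(heatmap)
h_slice, v_slice = extract_slices(video, row, col)
```

Under these assumptions, each slice is a 2D image whose first axis is time, so it can be passed to an image-based network as an additional view alongside the raw frames.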