Convolutional neural networks have become a major research topic in computer vision because of their strong performance in image classification. Against this background, this paper analyzes sports sequence images using a convolutional neural network. Because single-frame detection has a low detection rate and multiframe detection algorithms are complex, this paper proposes a new algorithm that combines single-frame and multiframe detection in order to improve the detection rate of small targets and reduce the detection time. Building on the traditional residual network, an improved multiscale residual network is also proposed. Its structure enables the convolution layers to “observe” the data at different scales and obtain richer input features. Moreover, the depth of the network is reduced, the vanishing gradient problem is effectively suppressed, and training becomes less difficult. Finally, ensemble learning with relative majority voting reduces the classification error rate of the network to 3.99% on CIFAR-10, which is 3% lower than the error rate of the original residual neural network.
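
As an illustration of the relative majority (plurality) voting step mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes each ensemble member has already produced one predicted class label per sample. The function names `plurality_vote` and `error_rate`, the toy labels, and the tie-breaking rule (smallest label wins) are all assumptions made for this example.

```python
from collections import Counter
from typing import List, Sequence


def plurality_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-model labels; predictions[m][i] is model m's label for sample i."""
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("every model must label the same number of samples")

    voted = []
    for sample_votes in zip(*predictions):  # votes of all models for one sample
        counts = Counter(sample_votes)
        top = max(counts.values())
        # Relative majority: the label with the most votes wins, even without
        # an absolute majority. Ties go to the smallest label (an assumption).
        voted.append(min(label for label, c in counts.items() if c == top))
    return voted


def error_rate(predicted: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of samples whose predicted label differs from the true label."""
    if len(predicted) != len(truth) or not truth:
        raise ValueError("predicted and truth must be non-empty and equal length")
    wrong = sum(p != t for p, t in zip(predicted, truth))
    return wrong / len(truth)


# Toy example with three hypothetical models and four samples.
model_a = [0, 1, 2, 2]
model_b = [0, 1, 1, 3]
model_c = [1, 1, 2, 4]
ensemble = plurality_vote([model_a, model_b, model_c])
```

In this toy case the last sample receives three different votes, so the tie-breaking rule decides the result. In practice the ensemble members would be separately trained networks, and the tie-breaking rule could instead use the members' confidence scores.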