Sports video classification is an active research topic with a wide range of applications in switched TV, video on demand, smart TV, and other fields, and it is closely related to people's everyday lives. Against this background, it has attracted considerable interest. However, existing methods usually rely on manual video classification, which is easily influenced by the annotators themselves, so the accuracy of the results is hard to guarantee and misclassifications occur. To address these limitations, we apply neural network technology to the automatic classification of sports videos. This paper proposes a novel attention-based graph convolution-guided third-order hourglass network (AGTH-Net) classification model. First, we design a graph convolution model based on the attention mechanism. The key idea is to use the attention mechanism to allocate weights to neighborhood nodes, which reduces the impact of erroneous nodes in the neighborhood while avoiding manual weight assignment. Second, in view of the complex image characteristics of sports videos, we adopt a third-order hourglass network structure to extract and fuse multiscale features. In addition, residual dense modules are introduced inside the hourglass network so that features can be transferred and reused across different network levels. This helps extract fine details as fully as possible and enhances the expressive ability of the network. Comparison and ablation experiments are carried out to demonstrate the effectiveness and superiority of the proposed algorithm.
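To make the idea of attention-based neighborhood weighting concrete, the following is a minimal sketch and not the AGTH-Net implementation. It assumes a dot-product similarity followed by a softmax as the attention score, which the text above does not specify; the names `attention_aggregate`, `features`, and `neighbors` are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_aggregate(features, neighbors, node):
    """Aggregate the features of `node` and its neighbors.

    Assumption: attention scores are dot products between the center
    node and each neighbor, normalized with softmax. A learned scoring
    function could be substituted here.
    """
    center = features[node]
    idx = [node] + list(neighbors[node])  # include the node itself
    scores = [dot(center, features[j]) for j in idx]
    weights = softmax(scores)
    dim = len(center)
    out = [
        sum(w * features[j][d] for w, j in zip(weights, idx))
        for d in range(dim)
    ]
    return out, dict(zip(idx, weights))


# Toy graph: node 2 points away from node 0, so it behaves like an
# "error node" in node 0's neighborhood.
features = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [-1.0, 0.0]}
neighbors = {0: [1, 2]}

aggregated, weights = attention_aggregate(features, neighbors, 0)
print(weights)
print(aggregated)
```

In this toy graph, the dissimilar neighbor (node 2) receives the smallest weight without any manually assigned coefficient, which is the behavior the attention mechanism is meant to provide.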