Rumination is a key indicator of health status in ruminants. However, conventional contact devices for detecting rumination behavior, such as ear tags and pressure sensors, raise animal welfare concerns. Deep learning, in which neural networks are trained on datasets, offers a promising solution for non-contact rumination recognition. We introduce UD-YOLOv5s, an approach to bovine rumination recognition that incorporates jaw skeleton feature extraction. First, a skeleton feature extraction method is proposed for the upper and lower jaws, using skeleton heatmap descriptors and the Kalman filter algorithm. The UD-YOLOv5s method is then developed for rumination recognition. To optimize the UD-YOLOv5s model, the traditional intersection over union (IoU) loss function is replaced with the generalized intersection over union (GIoU) loss. A self-built bovine rumination dataset is used to compare its performance with that of three techniques: the mean shift algorithm, the mask region-based convolutional neural network (Mask R-CNN), and you only look once version 3 (YOLOv3). The results of the ablation experiment demonstrate that UD-YOLOv5s achieves a precision of 98.25%, a recall of 97.75%, and a mean average precision of 93.43%. A generalization performance evaluation, conducted in a controlled experimental environment to ensure fairness, indicates that UD-YOLOv5s converges faster than the other models while maintaining comparable recognition accuracy. Moreover, at equal convergence speed, UD-YOLOv5s outperforms the other models in recognition accuracy. These findings support the accurate identification of cattle rumination behavior and show the potential of the UD-YOLOv5s method for advancing ruminant health assessment.
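
To illustrate the loss replacement mentioned above, the following is a minimal sketch of the generalized intersection over union for two axis-aligned boxes. It is not the UD-YOLOv5s implementation: the function names `giou` and `giou_loss` and the `(x1, y1, x2, y2)` corner format are assumptions made for illustration only.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2 for both boxes (illustrative format,
    not taken from the paper).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union

    # Area of the smallest box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    # GIoU subtracts the fraction of the enclosing box not covered by the union.
    return iou - (enclose - union) / enclose


def giou_loss(box_a, box_b):
    """Loss form commonly used for box regression: 1 - GIoU, in [0, 2]."""
    return 1.0 - giou(box_a, box_b)
```

Unlike the plain IoU, which is zero for every pair of non-overlapping boxes, the GIoU term still varies with the distance between the boxes, so the loss keeps a useful gradient when a predicted box misses its target entirely.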