Music can generate a positive effect in runners’ performance and motivation. However, the practical implementation of music intervention during exercise is mostly absent from the literature. Therefore, this paper designs a playback sequence system for joggers by considering music emotion and physiological signals. This playback sequence is implemented by a music selection module that combines artificial intelligence techniques with physiological data and emotional music. In order to make the system operate for a long time, this paper improves the model and selection music module to achieve lower energy consumption. The proposed model obtains fewer FLOPs and parameters by using logarithm scaled Mel-spectrogram as input features. The accuracy, computational complexity, trainable parameters, and inference time are evaluated on the Bi-modal, 4Q emotion, and Soundtrack datasets. The experimental results show that the proposed model is better than that of Sarkar et al. and achieves competitive performance on Bi-modal (84.91%), 4Q emotion (92.04%), and Soundtrack (87.24%) datasets. More specifically, the proposed model reduces the computational complexity and inference time while maintaining the classification accuracy, compared to other models. Moreover, the size of the proposed model for network training is small, which can be applied to mobiles and other devices with limited computing resources. This study designed the overall playback sequence system by considering the relationship between music emotion and physiological situation during exercise. The playback sequence system can be adopted directly during exercise to improve users’ exercise efficiency.