Muscle synergy analysis for gesture recognition is a fundamental research area in human-machine interaction, particularly in fields such as rehabilitation. However, previous methods for analyzing muscle synergy are typically not end-to-end and lack interpretability: they first extract specific features for gesture recognition from surface electromyography (sEMG) signals and then conduct muscle synergy analysis based on those features. To address these limitations, we propose an end-to-end framework for muscle synergy analysis, called Shapley-value-based muscle synergy (SVMS). Our approach first converts sEMG signals into grayscale sEMG images using a sliding window and then combines adjacent grayscale images into color images for gesture recognition. We then apply gradient-weighted class activation mapping (Grad-CAM) to identify the feature areas of the sEMG images that are most significant for gesture recognition. Grad-CAM generates a heatmap over each image, highlighting the regions that the model relies on to make its prediction. Finally, we use the Shapley value to quantitatively analyze muscle synergy within the areas identified by Grad-CAM. Experimental results demonstrate the effectiveness of SVMS for muscle synergy analysis. Moreover, we achieve a recognition accuracy of 94.26% for twelve gestures while reducing the required electrode channel information from ten to six dimensions and the number of analysis rounds from about 1000 to nine.
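
Two steps of the pipeline described above can be illustrated with a minimal sketch: turning a multichannel sEMG recording into stacked color images, and computing exact Shapley values over a small set of channels. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `semg_to_color_images` and `shapley_values`, the window length and step, min-max scaling to [0, 1], the choice of three consecutive grayscale frames per color image, and the caller-supplied value function `value_fn`.

```python
import itertools
import math

import numpy as np


def semg_to_color_images(signal, window=10, step=10):
    """Slice a (samples, channels) sEMG array into grayscale windows and
    stack each run of three adjacent windows into one 3-channel image."""
    lo, hi = signal.min(), signal.max()
    scaled = (signal - lo) / (hi - lo + 1e-12)  # assumed min-max scaling to [0, 1]
    starts = range(0, signal.shape[0] - window + 1, step)
    # Each grayscale image has shape (window, channels).
    gray = np.stack([scaled[s:s + window] for s in starts])
    # Assumption: "adjacent" is read as three consecutive grayscale frames.
    color = np.stack([gray[i:i + 3] for i in range(len(gray) - 2)])
    return np.moveaxis(color, 1, -1)  # (n_images, window, channels, 3)


def shapley_values(players, value_fn):
    """Exact Shapley values; value_fn maps a frozenset of players to a score."""
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):  # k is the size of the coalition that p joins
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for subset in itertools.combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when added to coalition s.
                phi[p] += weight * (value_fn(s | {p}) - value_fn(s))
    return phi
```

In such a sketch, `value_fn` would typically return a model score (for example, recognition accuracy) obtained using only the channels in the given coalition. Exact enumeration evaluates every coalition, so its cost grows as 2^n in the number of channels (1024 coalitions for ten channels), which is why restricting the analysis to a smaller set of relevant channels matters.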