IntroductionSurface Electromyographic (sEMG) signals are widely utilized for estimating finger kinematics continuously in human-machine interfaces (HMI), and deep learning approaches are crucial in constructing the models. At present, most models are extracted on specific subjects and do not have cross-subject generalizability. Considering the erratic nature of sEMG signals, a model trained on a specific subject cannot be directly applied to other subjects. Therefore, in this study, we proposed a cross-subject model based on the Rotary Transformer (RoFormer) to extract features of multiple subjects for continuous estimation kinematics and extend it to new subjects by adversarial transfer learning (ATL) approach.MethodsWe utilized the new subject’s training data and an ATL approach to calibrate the cross-subject model. To improve the performance of the classic transformer network, we compare the impact of different position embeddings on model performance, including learnable absolute position embedding, Sinusoidal absolute position embedding, and Rotary Position Embedding (RoPE), and eventually selected RoPE. We conducted experiments on 10 randomly selected subjects from the NinaproDB2 dataset, using Pearson correlation coefficient (CC), normalized root mean square error (NRMSE), and coefficient of determination (R2) as performance metrics.ResultsThe proposed model was compared with four other models including LSTM, TCN, Transformer, and CNN-Attention. The results demonstrated that both in cross-subject and subject-specific cases the performance of RoFormer was significantly better than the other four models. Additionally, the ATL approach improves the generalization performance of the cross-subject model better than the fine-tuning (FT) transfer learning approach.DiscussionThe findings indicate that the proposed RoFormer-based method with an ATL approach has the potential for practical applications in robot hand control and other HMI settings. The model’s superior performance suggests its suitability for continuous estimation of finger kinematics across different subjects, addressing the limitations of subject-specific models.