In recent years, human motion recognition based on wireless sensing has been widely applied in smart homes, virtual reality, and other fields. Within this area, recognition based on WiFi channel state information (CSI) has received considerable attention. Although current CSI-based methods perform well on large-scale actions, their accuracy drops on fine-grained human gesture recognition, because CSI gesture-sensing data are difficult to collect and few high-quality datasets exist. To address this, we convert CSI data into images and augment the CSI images using a super-resolution image generation method. We also propose an SCBAM attention mechanism that further strengthens the model's focus on important features. We evaluate the approach on the publicly available Widar3 dataset. The results show that our mixed attention model achieves recognition accuracies of 99.73%, 97.41%, 94.57%, and 95.98% in the in-scene, cross-location, cross-direction, and cross-scene settings, respectively, which are 0.07, 0.29, 0.05, and 0.99 percentage points higher than without data augmentation.
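
To make the CSI-to-image step concrete, the following is a minimal sketch only, not the procedure used in this work. It assumes a CSI segment is available as a complex array of shape (packets, subcarriers) and that the image is formed by min-max normalizing the amplitudes to 8-bit grayscale. The function name `csi_to_image` and the array dimensions are hypothetical, chosen for illustration.

```python
import numpy as np


def csi_to_image(csi: np.ndarray) -> np.ndarray:
    """Map a complex CSI matrix (packets x subcarriers) to an 8-bit grayscale image.

    Assumption: amplitude-only, min-max normalization. The actual conversion
    used in the paper may differ (e.g. it may also use phase information).
    """
    amplitude = np.abs(csi)                      # per-entry CSI magnitude
    lo, hi = amplitude.min(), amplitude.max()
    span = hi - lo
    if span == 0:                                # constant input: return a black image
        return np.zeros(amplitude.shape, dtype=np.uint8)
    scaled = (amplitude - lo) / span             # rescale to [0, 1]
    return np.round(scaled * 255).astype(np.uint8)


# Synthetic stand-in for one CSI segment: 64 packets x 30 subcarriers (hypothetical sizes).
rng = np.random.default_rng(0)
csi = rng.standard_normal((64, 30)) + 1j * rng.standard_normal((64, 30))
image = csi_to_image(csi)
```

An image produced this way could then be passed to an image-based augmentation or recognition pipeline; the min-max scaling is one simple choice among several and discards phase information.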