Human activity recognition (HAR) is an important research area with a wide range of applications, including smart homes, healthcare, and abnormal behavior detection. Wearable sensors, computer vision, radar, and other technologies are commonly used to detect human activity, but they are severely limited by issues such as cost, lighting, context, and privacy. This paper therefore proposes a high-performance, deep learning-based method for recognizing human activities from channel state information (CSI): the spatial module-temporal convolutional network (SM-TCNNET). The model consists of a spatial feature extraction module and a temporal convolutional network (TCN), which together extract the spatiotemporal features of CSI signals effectively. Extensive experiments on a self-collected dataset and a public dataset (StanWiFi) show that the model reaches accuracies of 99.93% and 99.80%, respectively. Compared with existing methods, the proposed SM-TCNNET model improves recognition accuracy by 1.8%.