Many deep learning (DL) models have shown exceptional promise in radar-based human activity recognition (HAR). In radar-based HAR, the raw data are generally converted into a 2-D spectrogram using the short-time Fourier transform (STFT). All existing DL methods treat the spectrogram as an optical image and therefore adopt image-oriented architectures such as 2-D convolutional neural networks (2D-CNNs). Because these 2-D methods ignore the temporal characteristics of the spectrogram, they typically yield complex networks with a large number of parameters but limited recognition accuracy. In this paper, for the first time, the radar spectrogram is treated as a time sequence with multiple channels. Accordingly, we propose a DL model composed of 1-D convolutional neural networks (1D-CNNs) and long short-term memory (LSTM). Experimental results show that the proposed model extracts the spatio-temporal characteristics of the radar data and thus achieves the highest recognition accuracy with relatively low complexity compared with existing 2D-CNN methods.

INDEX TERMS Radar signal processing, human activity recognition, convolutional neural network, recurrent neural network, deep learning.
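
As a rough illustration of the sequence view described in the abstract, the sketch below reinterprets a spectrogram of F frequency bins by T time frames as T time steps with F channels, then applies a single 1-D convolution along the time axis. It is a minimal sketch under stated assumptions, not the proposed model: the helper names (`as_sequence`, `conv1d_valid`), the toy spectrogram, and the kernel size are all hypothetical, and the LSTM stage is omitted.

```python
# Illustrative sketch only: the sizes, values, and helper names below are
# assumptions made for this example, not settings taken from the paper.

def as_sequence(spec):
    """Reinterpret an F x T spectrogram as T time steps with F channels each."""
    n_freq, n_time = len(spec), len(spec[0])
    return [[spec[f][t] for f in range(n_freq)] for t in range(n_time)]

def conv1d_valid(seq, kernels, bias):
    """1-D convolution along time (cross-correlation, no padding).

    seq:     T x C_in      multi-channel time sequence
    kernels: C_out x K x C_in
    bias:    C_out
    returns: (T - K + 1) x C_out
    """
    k = len(kernels[0])
    n_in = len(seq[0])
    out = []
    for t in range(len(seq) - k + 1):
        row = []
        for w, b in zip(kernels, bias):
            acc = b
            for i in range(k):          # slide over the time window
                for c in range(n_in):   # mix all frequency channels
                    acc += w[i][c] * seq[t + i][c]
            row.append(acc)
        out.append(row)
    return out

# Toy spectrogram: 3 frequency bins x 5 time frames (hypothetical values).
spec = [[10 * f + t for t in range(5)] for f in range(3)]

# Sequence view: 5 time steps, 3 channels per step.
seq = as_sequence(spec)

# One output channel, kernel length 2, all-ones weights, zero bias.
kernels = [[[1, 1, 1], [1, 1, 1]]]
bias = [0]
features = conv1d_valid(seq, kernels, bias)

print(len(seq), len(seq[0]))   # time steps, channels
print(features)
```

In this view each kernel spans every frequency channel but only a short window of time steps, so its weight count is K x C_in per output channel. In a full model, the resulting feature sequence would be passed to a recurrent layer such as an LSTM.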