To address the low accuracy of human motion recognition based on a single feature, a method that combines Frequency Modulated Continuous Wave (FMCW) radar with a Residual Network (ResNet) is proposed. An FMCW radar first captures the echo signals of six distinct human motions. The signals are preprocessed, and a two-dimensional Fourier transform is then applied to obtain the Range-time Map (RTM) and Doppler-time Map (DTM) of each motion. To improve feature extraction and recognition, the conventional single-channel input structure of convolutional neural networks is replaced with a dual-channel structure: the residual blocks of ResNet18 are modified with Inception V1 modules, and the Convolutional Block Attention Module (CBAM) is added, yielding a dual-channel fusion residual network that recognizes and classifies human motions. Experimental results show that the dual-feature fusion structure improves recognition accuracy by 1–4% compared with recognition in a single feature domain, indicating that the proposed model recognizes the six motions reliably.
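The signal-processing step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes complex (I/Q) echo samples arranged as a cube of shape (frames, chirps, samples per chirp), applies a two-dimensional FFT to each frame, and collapses the resulting range-Doppler magnitudes along one axis to form the RTM and the DTM. The function names, the Hann window, the summation used to collapse each axis, and the synthetic point target are all assumptions made for illustration; the preprocessing mentioned in the abstract (for example, clutter removal) is omitted.

```python
import numpy as np


def range_doppler(cube):
    """Per-frame 2-D FFT magnitude of a (frames, chirps, samples) cube.

    Assumption: fast time (samples) maps to range, slow time (chirps) to Doppler.
    """
    n_chirps, n_samples = cube.shape[1], cube.shape[2]
    # Hann window in both dimensions to reduce spectral leakage (assumed choice).
    window = np.hanning(n_chirps)[:, None] * np.hanning(n_samples)[None, :]
    spectrum = np.fft.fft2(cube * window, axes=(1, 2))
    # Move zero Doppler to the centre of the Doppler axis.
    spectrum = np.fft.fftshift(spectrum, axes=1)
    return np.abs(spectrum)  # shape: (frames, doppler_bins, range_bins)


def rtm_dtm(cube):
    """Collapse the range-Doppler maps into range-time and Doppler-time maps."""
    rd = range_doppler(cube)
    rtm = rd.sum(axis=1).T  # sum over Doppler -> (range_bins, frames)
    dtm = rd.sum(axis=2).T  # sum over range   -> (doppler_bins, frames)
    return rtm, dtm


def simulate_target(frames=8, chirps=16, samples=32,
                    start_range_bin=5, doppler_bin=3):
    """Hypothetical noiseless point target for checking the pipeline.

    The target moves one range bin per frame at a constant Doppler bin.
    """
    n = np.arange(samples)[None, None, :]
    c = np.arange(chirps)[None, :, None]
    r = (start_range_bin + np.arange(frames))[:, None, None]
    phase = 2 * np.pi * (r * n / samples + doppler_bin * c / chirps)
    return np.exp(1j * phase)


cube = simulate_target()
rtm, dtm = rtm_dtm(cube)
print(rtm.shape, dtm.shape)  # (32, 8) (16, 8)
```

In a dual-channel setup of the kind the abstract describes, the two maps would typically be converted to images and passed to separate input branches of the network; how the branches are fused is not specified here and is left out of the sketch.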