Through-wall radar human body pose recognition has broad applications in both military and civilian domains. Identifying the current pose of a target behind a wall and predicting its subsequent pose changes remain significant challenges. Conventional methods typically combine radar measurements with machine learning algorithms such as support vector machines (SVMs) and random forests, but these approaches have limited performance, particularly in complex scenarios. To address this challenge, this paper proposes a cross-modal supervised through-wall radar human body pose recognition method. By integrating information from cameras and radar, a cross-modal dataset was constructed and a corresponding deep learning network architecture was designed. During training, the network learned the pose features of targets occluded by walls, enabling accurate pose recognition (e.g., standing, crouching) in scenarios with unknown wall obstructions. Experimental results demonstrated that the proposed method outperforms traditional approaches, offering an effective and practical solution for through-wall radar applications. The contribution of this study lies in combining deep learning with cross-modal supervision, providing a new perspective for improving the robustness and accuracy of target pose recognition.
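The paper's actual network architecture and training pipeline are not detailed in this abstract; the sketch below is only a minimal illustration, in PyTorch, of the cross-modal supervision idea described here: pose labels obtained from the camera modality supervise a classifier that operates on radar data. The class name `RadarPoseNet`, the input resolution, the layer sizes, and the two-class pose set are assumptions for illustration, not the authors' design.

```python
# Minimal sketch of cross-modal supervision (assumed setup, not the paper's model):
# camera-derived pose labels supervise a small CNN that classifies radar heatmaps.
import torch
import torch.nn as nn

NUM_POSES = 2  # e.g., standing vs. crouching, as mentioned in the abstract


class RadarPoseNet(nn.Module):
    """Toy CNN over single-channel radar heatmaps (e.g., range-azimuth maps)."""

    def __init__(self, num_poses: int = NUM_POSES):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # 64x64 input -> 16x16 after two 2x2 poolings, 32 channels.
        self.classifier = nn.Sequential(nn.Flatten(), nn.Linear(32 * 16 * 16, num_poses))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))


def train_step(model, optimizer, radar_batch, camera_labels):
    """One cross-modal training step: radar input, camera-derived pose labels."""
    optimizer.zero_grad()
    logits = model(radar_batch)
    loss = nn.functional.cross_entropy(logits, camera_labels)
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    model = RadarPoseNet()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    # Stand-ins for a paired (radar frame, camera-derived label) mini-batch.
    radar_batch = torch.randn(8, 1, 64, 64)            # 8 radar heatmaps, 64x64
    camera_labels = torch.randint(0, NUM_POSES, (8,))  # labels from the camera branch
    print("loss:", train_step(model, optimizer, radar_batch, camera_labels))
```

At inference time the camera branch would be removed entirely, so only the radar network is needed to recognize poses behind unknown walls; the camera serves purely as a source of supervision during training.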