Falls are a major risk factor for older adults, increasing morbidity and healthcare costs. Video-based fall-detection systems offer crucial real-time monitoring and assistance. Yet, their deployment faces challenges such as maintaining privacy, reducing false alarms, and providing understandable outputs for healthcare providers. This paper introduces an innovative automated fall-detection framework that includes a Gaussian blur module for privacy preservation, an OpenPose module for precise pose estimation, a short-time Fourier transform (STFT) module to capture frames with significant motion selectively, and a computationally efficient one-dimensional convolutional neural network (1D-CNN) classification module designed to classify these frames. Additionally, integrating a gradient-weighted class activation mapping (GradCAM) module enhances the system’s explainability by visually highlighting the movement of the key points, resulting in classification decisions. Modular flexibility in our system allows customization to meet specific privacy and monitoring needs, enabling the activation or deactivation of modules according to the operational requirements of different healthcare settings. This combination of STFT and 1D-CNN ensures fast and efficient processing, which is essential in healthcare environments where real-time response and accuracy are vital. We validated our approach across multiple datasets, including the Multiple Cameras Fall Dataset (MCFD), the UR fall dataset, and the NTU RGB+D Dataset, which demonstrates high accuracy in detecting falls and provides the interpretability of results.