Human activity recognition (HAR) is a challenging and active research topic in computer science due to its applications in video surveillance, health monitoring, rehabilitation, human-robot interaction, robotics, gesture and posture analysis, and sports. Earlier studies relied on handcrafted features to identify human activities and obtained good accuracy; however, the performance of such features degrades in complex situations. Recent research has therefore used deep learning (DL) techniques to capture local features automatically from the given activity instances. Although automatic feature extraction overcomes the problems of handcrafted features, there is still a need to enhance the efficiency and accuracy of existing techniques, which is the motivation for this research. This research proposes a HAR system that applies data enhancement techniques before capturing a robust and discriminative feature set from each activity instance. The captured feature set is given to a transformer model for activity recognition using the PAMAP2, UCI HAR, and WISDM datasets. The results show that the proposed HAR model outperforms the baseline methods. Specifically, it achieved 98.2% accuracy on PAMAP2 with all instances in 12 activities, 98.6% accuracy on UCI HAR with all instances in 6 activities, and 97.3% accuracy on WISDM with all instances in 6 activities. The advantage of the proposed hybrid features is their capability to capture both low-level and high-level information from the sensor data, potentially enhancing the discriminative power of the system. In addition, this study employed a transformer model because of its ability to capture long-range dependencies, which is beneficial in recognizing complex human activity patterns.