Video analytics has become a critical area of study in computer vision, driven by the abundance of available video data. Automated human activity recognition from video footage is increasingly popular because of its uses in video surveillance, healthcare, and industry. Neural network models are now used across a wide range of scientific, academic, and commercial applications to solve image processing problems. A key benefit of 3D convolutional neural networks (3D CNNs) is their ability to learn hierarchical representations of spatiotemporal features. In this work, we propose a novel 3D CNN model for recognizing human activity in video sequences. Our key contributions are a pre-processing technique comprising key frame selection and background segmentation, and the design and training of an efficient 3D CNN for classifying human activities. The proposed model is tested on benchmark datasets such as KTH, Weizmann, and UT-I. We also evaluate how well the model handles the challenges these datasets present. The proposed technique achieves higher recognition accuracy and faster training than the reference methods. It has promising applications wherever accurate activity recognition is vital, including surveillance, healthcare, sports analysis, and human-computer interaction.