Objective: To develop a robust and effective computer vision system that automatically identifies and classifies human actions in video data, accounting for temporal dynamics and varied environmental conditions. This technology has numerous applications in surveillance, human-computer interaction, and video analysis. Methods: Dense trajectories are extracted by densely sampling points and tracking them with dense optical flow, which computes a motion vector for each point; alternative approaches rely on sparse key point detectors such as the Scale-Invariant Feature Transform (SIFT) or the Harris corner detector. Findings: By describing the motion of the trajectories, trajectory descriptors produce remarkably strong results on their own: 90.2% on KTH and 47.7% on Hollywood2 for dense trajectories. This demonstrates the significance of the motion information present in the local trajectory patterns. Because the trajectory descriptors capture a large amount of camera motion, we report only 67.2% on YouTube. Novelty: This study presents a method for modeling videos that combines dense sampling and feature tracking. Compared with earlier video descriptors, our dense trajectories are more robust; they effectively capture the motion information in videos and outperform state-of-the-art action classification techniques.
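
The trajectory descriptor mentioned above can be sketched as follows: given a sequence of tracked point positions, the descriptor is the sequence of frame-to-frame displacement vectors, normalized by the sum of their magnitudes. This is a minimal illustration in NumPy; the function name and array shapes are our own assumptions, not part of the original system.

```python
import numpy as np

def trajectory_descriptor(points):
    """Normalized-displacement descriptor for one tracked trajectory.

    points: (L+1, 2) array-like of (x, y) positions over L+1 frames.
    Returns a flat vector of L displacement vectors, each normalized
    by the sum of displacement magnitudes along the trajectory.
    (Hypothetical sketch; the full system also uses HOG/HOF-style
    descriptors around each trajectory.)
    """
    pts = np.asarray(points, dtype=float)
    deltas = np.diff(pts, axis=0)                 # per-frame displacements
    total = np.linalg.norm(deltas, axis=1).sum()  # sum of magnitudes
    if total == 0.0:
        # A static trajectory carries no motion information.
        return np.zeros(deltas.size)
    return (deltas / total).ravel()

# Example: a point drifting right for two frames, then up for one.
desc = trajectory_descriptor([(0, 0), (1, 0), (2, 0), (2, 1)])
```

After normalization, the magnitudes of the descriptor's displacement vectors sum to one, which makes trajectories of different overall speeds comparable.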