Automating human activity recognition is one of the most appealing and practical research areas in computer vision. In this article, we address the problem of video-based student activity detection. Student Activity Detection using YOLO (SADY) aims to recognize normal and abnormal student activities so that immediate intervention is possible in case of risk or necessity. We created our own classroom dataset of around 220 recordings depicting seven student classroom activities. The YOLOv4 Tiny model was retrained on 5000 labeled keyframes extracted from the training videos and then tested on single and multiple activity detection. We report evaluation results for the proposed model across various values of hyperparameters such as the confidence threshold and the Intersection over Union (IoU) threshold. For each frame of the test videos, the model assigns a confidence score and an action label by positioning recurrent activity labels. The proposed approach achieved a mean average precision (mAP) of 95% at 45 frames per second (FPS) on the student activity Class Room (CR) dataset and an mAP of 95.18% on the LIRIS dataset. Experiments on the recorded Class Room dataset and the publicly accessible LIRIS dataset show that the proposed approach outperforms existing approaches in recognition accuracy and speed. These results suggest that the proposed framework could effectively monitor students’ activities in schools, colleges, and universities.
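To make the role of the two thresholds concrete, the following is a minimal sketch of how a confidence threshold and an IoU threshold commonly act on raw detections for one frame. It is not the authors' implementation: the `Detection` type, the `filter_detections` helper, the default threshold values, and the activity labels in the usage are all hypothetical. It also assumes the IoU threshold is applied as class-wise non-maximum suppression; the IoU threshold reported in the evaluation may instead refer to matching predictions against ground-truth boxes.

```python
from typing import List, NamedTuple, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


class Detection(NamedTuple):
    label: str         # activity label (hypothetical names in the usage below)
    confidence: float  # detector confidence in [0, 1]
    box: Box


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(
    detections: List[Detection],
    conf_threshold: float = 0.5,  # assumed default, not taken from the paper
    iou_threshold: float = 0.5,   # assumed default, not taken from the paper
) -> List[Detection]:
    """Drop low-confidence boxes, then suppress same-label overlapping boxes."""
    candidates = sorted(
        (d for d in detections if d.confidence >= conf_threshold),
        key=lambda d: d.confidence,
        reverse=True,
    )
    kept: List[Detection] = []
    for d in candidates:
        # Keep d only if no already-kept box of the same label overlaps it too much.
        if all(d.label != k.label or iou(d.box, k.box) <= iou_threshold for k in kept):
            kept.append(d)
    return kept


# Usage with made-up detections for a single frame.
frame_detections = [
    Detection("writing", 0.90, (0, 0, 10, 10)),
    Detection("writing", 0.60, (1, 0, 11, 10)),     # near-duplicate of the first box
    Detection("sleeping", 0.30, (50, 50, 60, 60)),  # below the confidence threshold
]
kept = filter_detections(frame_detections)
```

Raising the confidence threshold trades recall for precision, while the IoU threshold controls how much overlap is tolerated before two boxes are treated as the same detection, which is why results are typically reported over several values of both.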