“…Since YOLOv7 is a new state-of-the-art real-time object detection method (Wang, et al, 2023), we used YOLOv7 to detect all the students present in the classroom and then to detect the face of each identified student for facial expression and eye state classification. We implemented face recognition with the VGGFace model (Simonyan & Zisserman, 2015; Parkhi, et al, 2015) to identify each individual student present in the classroom; we chose VGGFace because many studies show that it is comparatively more accurate than other pre-trained models (Ghazi & Ekenel, 2016; Goel, et al, 2021; Grm, et al, 2018; Chandra & Reddy, 2020). We performed facial expression classification using the ResNet neural network architecture (He, et al, 2016), because many studies suggest that ResNet could be a leading network architecture for image classification (Rahman, et al, 2020; Schieck, et al, 2023; Yang, et al, 2021; Mascarenhas & Agarwal, 2021).…”
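
The staged design described in the passage (detect each student, detect the face inside that detection, then run recognition and expression classification on the face crop) can be sketched in Python as follows. This is only a minimal illustration of the control flow, not the authors' implementation: the names `analyze_frame`, `StudentResult`, `crop`, and the four injected callables are hypothetical, the image is a plain nested list, and the real YOLOv7, VGGFace, and ResNet models are assumed to be wrapped behind those callables.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Box = Tuple[int, int, int, int]   # (x1, y1, x2, y2) in pixel coordinates
Image = List[List[int]]           # placeholder image type: rows of pixel values


@dataclass
class StudentResult:
    person_box: Box               # student detection in frame coordinates
    face_box: Optional[Box]       # face detection, relative to the person crop
    identity: Optional[str]       # output of the face recognition stage
    expression: Optional[str]     # output of the expression classification stage


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by box (x2 and y2 are exclusive)."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def analyze_frame(
    image: Image,
    detect_people: Callable[[Image], List[Box]],        # assumed YOLOv7 wrapper
    detect_face: Callable[[Image], Optional[Box]],      # assumed YOLOv7 wrapper
    recognize: Callable[[Image], str],                  # assumed VGGFace wrapper
    classify_expression: Callable[[Image], str],        # assumed ResNet wrapper
) -> List[StudentResult]:
    """Run the detection, recognition, and classification stages in order."""
    results = []
    for person_box in detect_people(image):
        person_crop = crop(image, person_box)
        face_box = detect_face(person_crop)
        if face_box is None:
            # No face found: keep the student detection, skip later stages.
            results.append(StudentResult(person_box, None, None, None))
            continue
        face_crop = crop(person_crop, face_box)
        results.append(
            StudentResult(
                person_box,
                face_box,
                recognize(face_crop),
                classify_expression(face_crop),
            )
        )
    return results
```

Passing the models in as callables keeps the sketch independent of any particular framework; an eye state classifier, which the passage also mentions, would be one more callable applied to the same face crop.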