Because students far outnumber teachers in a classroom, it is difficult for a teacher to keep track of how every student is learning. The problem became more prominent during the epidemic, when online teaching became common. How to monitor each student’s learning more comprehensively therefore remains an open problem in teaching, and recognizing students’ facial expressions is one of the most promising solutions. In this paper, an improved facial expression recognition model based on the multi-head attention mechanism is proposed. The model is tested on two student expression databases, JAFFE and OL-SFED, where it reaches recognition rates of 99.5% and 100%, respectively. To allow comparison with models developed by other researchers, it is also tested on RAF-DB, where it achieves a best recognition rate of 90.35% and an average recognition rate of 83.66%, which represents the best reported result to date.