With the rapid development of artificial intelligence, emotion recognition has been applied in many aspects of daily life, and the use of eye-tracking technology for emotion recognition has become an important branch of affective computing. To explore the relationship between eye movement signals and learners' emotional states in an online video learning environment, we apply machine learning and convolutional neural network methods to recognize eye movement signals and classify learners' emotional states into two categories, positive and negative. The study of eye movement data under different time windows comprises four stages: data collection, data preprocessing, classifier modeling, and training and testing. In this paper, an Eye-movement Feature Extraction Classification Network (EFECN) based on a convolutional neural network is proposed for small-sample eye movement data and eye-movement-based emotional state classification. The eye movement data are transformed into images through cross-modal conversion and used as input to several different deep convolutional neural networks, which classify the emotional states as positive or negative. Accuracy is used as the evaluation index to compare the different models: the eye movement emotion recognition model reaches 72% accuracy with the SVM model and 91.62% with the EFECN model. The experimental results show that the deep-learning-based convolutional neural network achieves a significant improvement in recognition accuracy over traditional machine learning methods.
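To illustrate the general idea of the pipeline described above, the sketch below shows one possible way to convert multichannel eye movement sequences into image-like tensors (a cross-modal conversion) and classify them into positive/negative emotion with a small CNN. This is a minimal illustrative sketch, not the authors' EFECN architecture: the channel layout (gaze x, gaze y, pupil size), the 32x32 image size, and the network depth are all assumptions, since the abstract does not specify these details.

```python
# Minimal sketch (NOT the EFECN model itself): cross-modal conversion of
# eye-movement sequences into images, followed by a small CNN classifier.
# Channel layout, window length, and image size are hypothetical choices.
import torch
import torch.nn as nn


def sequence_to_image(seq: torch.Tensor, size: int = 32) -> torch.Tensor:
    """Resample a (channels, T) eye-movement sequence into a (1, size, size)
    grayscale image by resampling each channel along time and stacking the
    channels row-wise until the square grid is filled."""
    channels, _ = seq.shape
    # Resample each channel to `size` samples along the time axis.
    resampled = nn.functional.interpolate(
        seq.unsqueeze(0), size=size, mode="linear", align_corners=False
    ).squeeze(0)                                                   # (channels, size)
    rows_per_channel = size // channels
    image = resampled.repeat_interleave(rows_per_channel, dim=0)   # (<=size, size)
    image = nn.functional.pad(image, (0, 0, 0, size - image.shape[0]))
    return image.unsqueeze(0)                                      # (1, size, size)


class SmallEyeMovementCNN(nn.Module):
    """A compact CNN for binary (positive/negative) emotion classification."""

    def __init__(self) -> None:
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, 2)  # 32x32 input -> 8x8 feature maps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))


if __name__ == "__main__":
    # Fake batch: 4 recordings, 3 channels (gaze x, gaze y, pupil size), 500 samples.
    batch = torch.randn(4, 3, 500)
    images = torch.stack([sequence_to_image(s) for s in batch])    # (4, 1, 32, 32)
    logits = SmallEyeMovementCNN()(images)
    print(logits.shape)                                            # torch.Size([4, 2])
```

In practice, the converted images would be fed to deeper pretrained convolutional networks for comparison, and classification accuracy on a held-out test set would serve as the evaluation metric, as described in the abstract.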