Facial expression recognition technology is a technology based on the combination of artificial intelligence technology and biology, and it is interdisciplinary research. The appearance of this technology shows the diversity of the development direction and application fields of computer technology, but it also means that the development of facial expression recognition technology needs not only the support of computer technology but also the exploration and progress of biology. At present, to recognize facial expressions a system usually contains three stages, namely, face detection stage, feature extraction stage, and face expression recognition stage. The facial expression recognition algorithm is transformed into a classification problem, and a convolutional neural network (CNN) is adopted, where features of the facial expressions could be automatically extracted. It takes as input a preprocessed image and can directly produce encoded features and predictions, leveraging the end-to-end strategy. It could discard the complicated intermediate modeling process of traditional machine learning. In this work a CNN is implemented for the recognition of facial expression on Fer2013 dataset. Moreover, the effectiveness of different numbers of CNN layers are testified on this problem. It could be concluded that with deeper architectures, the CNN tends to perform better.