Video-based emotion recognition is a long-standing research topic for computer scientists and psychiatrists. Compared with traditional discrete emotion models, continuous emotion models better describe how emotions progress over time, and the quantitative analysis of emotion they enable is important for the development of intelligent products. Current solutions to continuous emotion recognition still have shortcomings: the original continuous emotion dataset is incompletely annotated, and existing methods often ignore the temporal information between frames. This paper addresses both problems. First, to handle incomplete video labels, the correlation between discrete and continuous video emotion labels is used to build a mathematical model that fills in the missing labels of the original dataset without adding new data. Second, a continuous emotion recognition network based on an optimized temporal convolutional network is proposed; it adds a feature extraction submodule and a residual module to retain shallow features while improving feature extraction ability. Finally, validation experiments on the Aff-Wild2 dataset with these measures achieved accuracies of 0.5159 and 0.65611 on the valence and arousal dimensions, respectively.
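
To make the idea of a residual connection in a temporal convolutional network concrete, the following is a minimal sketch of a generic residual block built on a causal dilated convolution. It is not the network proposed in this paper: it assumes a single feature channel, a fixed hand-written kernel, and a ReLU activation. The function names are hypothetical, and the feature extraction submodule is not shown.

```python
from typing import List


def causal_dilated_conv(x: List[float], kernel: List[float], dilation: int) -> List[float]:
    """1-D causal convolution: out[t] uses only x[t], x[t - d], x[t - 2d], ..."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # indices before the first frame act as zero padding
                acc += w * x[idx]
        out.append(acc)
    return out


def relu(x: List[float]) -> List[float]:
    return [max(0.0, v) for v in x]


def residual_temporal_block(x: List[float], kernel: List[float], dilation: int) -> List[float]:
    """Add the block input back to the convolved features (skip connection),
    so the shallow per-frame signal is carried through unchanged."""
    y = relu(causal_dilated_conv(x, kernel, dilation))
    return [a + b for a, b in zip(x, y)]


# Illustrative per-frame feature values (assumed, not taken from any dataset).
frames = [1.0, 2.0, 3.0, 4.0]
print(residual_temporal_block(frames, kernel=[0.5, 0.5], dilation=2))
```

Because the convolution is causal, the output at a given frame depends only on that frame and earlier ones, which is one common way for a model to use temporal information between frames; the skip connection is what lets shallow features survive deeper stacks of such blocks.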