This study was carried out with the aim of exploring the full-convolution Siamese network (SiamFC) in the application of neonatal facial video image tracking, achieving accurate recognition of neonatal pain and helping doctors evaluate neonatal emotions in an automatic manner. The current technology shows low accuracy on facial image recognition of newborns, so the SiamFC algorithm under the deep learning was optimized in this study. Besides, a newborn facial video image tracking model (FVIT model) was constructed based on the SiamFC algorithm in combination with the attention mechanism with face tracking algorithm, and the facial features of newborns were tracked and recognized. In addition, a newborn face database was constructed based on the adult face database to evaluate performance of the FVIT model. It was found that the accuracy of the improved algorithm is 0.889, higher by 0.036 in contrast to other models; the area under the curve (AUC) of success rate reaches 0.748, higher by 0.075 compared with other algorithms. What's more, the improved algorithm shows good performance in tracking the facial occlusion, facial expression changes, and scale conversion of newborns. Therefore, the improved algorithm shows higher accuracy and success rate and has good effect in capturing and tracking the facial images of newborns, thereby providing an experimental basis for facial recognition and pain assessment of newborns in the later stage.