Temporal modeling is essential for video super-resolution. Most models exploit the temporal information of consecutive frames directly, through alignment or recurrent methods. However, features extracted directly from the input frames are coarse, which limits the performance and generalization ability of the model. In this paper, we therefore investigate recurrent feature updating networks. We propose a feature update module that refines the input information to improve feature accuracy, and we design the model in a multi-stage form to improve performance and generalization. To further improve performance, we also propose a difference supplement module, which exploits the temporal differences between groups to complement the information missing from the output of each stage. Experiments show that our model achieves 27.78 dB, 30.44 dB, and 39.38 dB on Vid4, SPMCS, and UDM10, respectively, demonstrating state-of-the-art performance on video super-resolution. Code is available at https://github.com/gfli123/RFUN.