There is increasing interest in enhancing the quality of low-resolution (LR) facial images for a variety of applications in social life. Existing methods often rely on domain-specific prior knowledge, which is effective in improving the performance of face super-resolution models. However, rich and accurate prior information is difficult to obtain from LR inputs in real-world scenarios, which can limit the robustness and generalization ability of such models. In this paper, a multisource reference-based face super-resolution network, namely MSRNet, is proposed. Without relying on prior knowledge of faces, the network can super-resolve an LR face image by a magnification factor of 8 under the guidance of multiple reference face images of different identities. Using a newly constructed "appearance-alike" reference data set, Face_Ref, MSRNet aims to fully exploit the local and spatially similar high-frequency information shared between the distinct references and the current face. More specifically, to effectively combine the information from multiple references, a cross-scale and cross-space feature fusion mechanism is introduced for external and internal references, and the enhanced local semantics are then incorporated into the high-resolution face reconstruction. The