Facial image super-resolution (SR) is an important aspect of facial analysis, and it can contribute significantly to tasks such as face alignment, face recognition, and image-based 3D reconstruction. Recent convolutional neural network (CNN) based models have exhibited significant advancements by learning mapping relations using pairs of low-resolution (LR) and high-resolution (HR) facial images. However, because these methods are conventionally aimed at increasing the PSNR and SSIM metrics, the reconstructed HR images might be blurry and have an overall unsatisfactory perceptual quality even when state-of-the-art quantitative results are achieved. In this study, we address this limitation by proposing an adversarial framework intended to reconstruct perceptually high-quality HR facial images while simultaneously removing blur. To this end, a simple five-layer CNN is employed to extract feature maps from LR facial images, and this feature information is provided to two-branch encoder-decoder networks that generate HR facial images with and without blur. In addition, local and global discriminators are combined to focus on the reconstruction of HR facial structures. Both qualitative and quantitative results demonstrate the effectiveness of the proposed method for generating photorealistic HR facial images from a variety of LR inputs. Moreover, it was also verified, through a use case scenario that the proposed method can contribute more to the field of face recognition than existing approaches.