Joint Super-Resolution and Head Pose Estimation for Extreme Low-Resolution Faces

Malakshan, Sahar Rahimi; Ebrahimi Saadabadi, Mohammad Saeed; Mostofa, Moktari; Soleymani, Sobhan; Nasrabadi, Nasser M.

doi:10.1109/access.2023.3241606

Cited by 13 publications

(5 citation statements)

References 97 publications

“…Recently, Malakshan et al. [170] presented a completely different novel approach that jointly solves Face Super-Resolution (FSR) and HPE problems. To this end, a Multi-Stage Generative Adversarial Network (MSGAN) has been proposed: it benefits from the pose-aware adversarial loss and the head pose estimation feedback to generate super-resolved images that are properly aligned for HPE.…”

Section: Multi-task Methods (mentioning)

confidence: 99%

“…It is proven that when the resolution variation increases, the performance on the original High-Resolution (HR) samples drops [8]. Few studies have been conducted on establishing a resolution-agnostic HPE framework [170].…”

Section: Issues and Problems (mentioning)

confidence: 99%

“…Finally, only Malakshan et al. [170] explored the use of generative models, showing that HPE can be effectively solved in conjunction with other face-related tasks typically associated with the generative field. This seems a very interesting possibility that showed promising results in another partially unexplored area of the HPE task: extreme low-resolution images.…”

Section: Research Directions (mentioning)

confidence: 99%

Deep Learning for Head Pose Estimation: A Survey

Asperti, Filippini. 2023. SN COMPUT. SCI.

Head pose estimation (HPE) is an active and popular area of research. Over the years, many approaches have been developed, leading to a progressive improvement in accuracy; nevertheless, head pose estimation remains an open research topic, especially in unconstrained environments. In this paper, we will review the increasing number of available datasets and the modern methodologies used to estimate orientation, with special attention to deep learning techniques. We will discuss the evolution of the field by proposing a classification of head pose estimation methods, explaining their advantages and disadvantages, and highlighting the different ways deep learning techniques have been used in the context of HPE. An in-depth performance comparison and discussion is presented at the end of the work. We also highlight the most promising research directions for future investigations on the topic.

“…Multi-Stage Generative Adversarial Network [33] (MS-GAN) proposed an end-to-end head pose estimation network and integrated it with the FSR network. Utilizing pose-aware adversarial loss and head pose alignment feedback improves the fidelity of non-frontal face images in real-world scenarios.…”

Section: B. Deep Learning Based Face Super-resolution (mentioning)

confidence: 99%

MSRFSR: Multi-Stage Refining Face Super-Resolution With Iterative Collaboration Between Face Recovery and Landmark Estimation

Hajian, Aramvith. 2024. IEEE Access

Face Super-resolution (FSR) models encounter a significant challenge related to extremely low-dimensional (16×16 pixels) and degraded input images. This deficiency in crucial facial details within the low-level and intermediate levels of the FSR model presents obstacles in tasks such as face alignment, landmark detection, and consequently, difficulty in recovering high-frequency details, resulting in unfaithful and unrealistic super-resolved face images. This research proposes an innovative FSR model with strategically designed multi-attention techniques to enhance facial attribute recovery capabilities. The model incorporates a Non-local Module (NL) and residual pixel attention technique at the low-level stage of the FSR model. Simultaneously, a Spatial Feature Transfer (SFT) module refines mid-level features by leveraging spatial information through an iterative interaction process between an attentive module and a landmark estimation network. By strategically utilizing these modules under an iterative collaboration framework, our method effectively addresses challenges in facial detail recovery, demonstrating enhanced model understanding and refined representation. The proposed model is rigorously examined on CelebA, Helen, AFLW2000, and WFLW datasets at scale factors of ×8 and ×16. The results consistently demonstrate the superiority of our proposed Multi-Stage Refining Face Super-Resolution (MSRFSR) model over state-of-the-art methods through extensive quantitative and qualitative experiments on four datasets and both scales. Index Terms: face image super-resolution, non-local attention, residual pixel attention, spatial feature transfer.

“…For example, the digital zoom algorithm used in mobile cameras and the image enhancement techniques used in digital devices. Furthermore, this core technology can be applied to a wide range of Computer Vision tasks, which leads to improvements in various Vision tasks, such as object detection [2], [3], medical imaging [4], [5], security and surveillance imaging [6], [7], face recognition [8], [9].…”

Section: Introduction (mentioning)

confidence: 99%

SRFormer: Efficient Yet Powerful Transformer Network for Single Image Super Resolution

Mehri, Behjati, Carpio et al. 2023. IEEE Access

Recent breakthroughs in single image super resolution have investigated the potential of deep Convolutional Neural Networks (CNNs) to improve performance. However, CNN-based models suffer from their limited fields and their inability to adapt to the input content. Recently, Transformer-based models were presented, which demonstrated major performance gains in Natural Language Processing and Vision tasks while mitigating the drawbacks of CNNs. Nevertheless, Transformer computational complexity can increase quadratically for high-resolution images, and the fact that it ignores the original structures of the image by converting them to the 1D structure can make it problematic to capture the local context information and adapt it for real-time applications. In this paper, we present SRFormer, an efficient yet powerful Transformer-based architecture, by making several key designs in the building of Transformer blocks and Transformer layers that allow us to consider the original structure of the image (i.e., 2D structure) while capturing both local and global dependencies without raising computational demands or memory consumption. We also present a Gated Multi-Layer Perceptron (MLP) Feature Fusion module to aggregate the features of different stages of Transformer blocks by focusing on inter-spatial relationships while adding minor computational costs to the network. We have conducted extensive experiments on several super-resolution benchmark datasets to evaluate our approach. SRFormer demonstrates superior performance compared to state-of-the-art methods from both Transformer and Convolutional networks, with an improvement margin of 0.1 to 0.53 dB. Furthermore, with almost the same model size, SRFormer outperforms SwinIR by 0.47% and runs in half of SwinIR's inference time. The code will be available on GitHub.
