A Multi-Scale Deep Back-Projection Backbone for Face Super-Resolution with Diffusion Models

Gao, J.; Tang, Ni; Zhang, Dongxiao

doi:10.3390/app13148110

Cited by 2 publications

(1 citation statement)

References 56 publications

“…Inspired by the great success of generative adversarial networks (GANs) in the image processing field [16][17][18], Yang et al [19] introduced a collaborative suppression and replenishment framework based on GANs. Gao et al [20] proposed a conditional generative model based on the diffusion model, which replaces the U-Net in super-resolution to capture complex details and fine textures. Addressing the fact that GAN-based methods require greater computational resources, PCA-SRGAN [21] uses Principal Component Analysis decomposition, while SPGAN [22] employs supervised pixel-wise loss to ease the GAN training process.…”

Section: Introduction (mentioning)

confidence: 99%

Feature Maps Need More Attention: A Spatial-Channel Mutual Attention-Guided Transformer Network for Face Super-Resolution

Zhang,

2024

Applied Sciences

Recently, transformer-based face super-resolution (FSR) approaches have achieved promising success in restoring degraded facial details due to their high capability for capturing both local and global dependencies. However, while existing methods focus on introducing sophisticated structures, they neglect the potential feature map information, limiting FSR performance. To circumvent this problem, we carefully design a pair of guiding blocks to dig for possible feature map information to enhance features before feeding them to transformer blocks. Relying on the guiding blocks, we propose a spatial-channel mutual attention-guided transformer network for FSR, for which the backbone architecture is a multi-scale connected encoder–decoder. Specifically, we devise a novel Spatial-Channel Mutual Attention-guided Transformer Module (SCATM), which is composed of a Spatial-Channel Mutual Attention Guiding Block (SCAGB) and a Channel-wise Multi-head Transformer Block (CMTB). SCATM on the top layer (SCATM-T) aims to promote both local facial details and global facial structures, while SCATM on the bottom layer (SCATM-B) seeks to optimize the encoded features. Considering that different scale features are complementary, we further develop a Multi-scale Feature Fusion Module (MFFM), which fuses features from different scales for better restoration performance. Quantitative and qualitative experimental results on various datasets indicate that the proposed method outperforms other state-of-the-art FSR methods.

Joint Motion Deblurring and Super-Resolution for Single Image Using Diffusion Model and GAN

Zhang,

Tang,

2024

IEEE Signal Process. Lett.
