Enhancing Remote Sensing Image Super-Resolution with Efficient Hybrid Conditional Diffusion Model

Han, Lintao; Zhao, Yuchen; Lv, Hengyi; Zhang, Yisa; Liu, Hailong; Bi, Guoling; Han, Qi

doi:10.3390/rs15133452

Cited by 17 publications

(7 citation statements)

References 46 publications

“…To better enhance the image restoration capability of diffusion models, existing works [43, 44] incorporate latent features from conditional neural networks into training diffusion models. Specifically, the method extracts integrated features from low-resolution images through a neural network for conditioning to guide image generation.…”

Section: Related Work (mentioning)

confidence: 99%

“…If they are directly linearly or nonlinearly combined, the desired performance results cannot be obtained. The existing methods, such as those in [43, 52], that fuse convolutional neural networks with diffusion models directly fuse features from the two domains with gaps, which will inevitably lead to image distortions and detail losses. Therefore, how to organically and concisely achieve the fusion of the two has become a universally recognized challenge.…”

Section: Approach (mentioning)

confidence: 99%

PixRevive: Latent Feature Diffusion Model for Compressed Video Quality Enhancement

Wang, Jing, Fan et al. 2024, Sensors

In recent years, the rapid prevalence of high-definition video in Internet of Things (IoT) systems has been directly facilitated by advances in imaging sensor technology. To adapt to limited uplink bandwidth, most media platforms opt to compress videos to bitrate streams for transmission. However, this compression often leads to significant texture loss and artifacts, which severely degrade the Quality of Experience (QoE). We propose a latent feature diffusion model (LFDM) for compressed video quality enhancement, which comprises a compact edge latent feature prior network (ELPN) and a conditional noise prediction network (CNPN). Specifically, we first pre-train ELPNet to construct a latent feature space that captures rich detail information for representing sharpness latent variables. Second, we incorporate these latent variables into the prediction network to iteratively guide the generation direction, thus resolving the problem that the direct application of diffusion models to temporal prediction disrupts inter-frame dependencies, thereby completing the modeling of temporal correlations. Lastly, we innovatively develop a Grouped Domain Fusion module that effectively addresses the challenges of diffusion distortion caused by naive cross-domain information fusion. Comparative experiments on the MFQEv2 benchmark validate our algorithm’s superior performance in terms of both objective and subjective metrics. By integrating with codecs and image sensors, our method can provide higher video quality.

“…Recently, SR approaches using diffusion techniques have been proposed. For instance, Han et al [50] used diffusion to create detailed super-resolved images and used feature distillation to reduce inference time. Wu et al [51] used diffusion together with contrastive learning to estimate the degradation kernels of images, without making assumptions about the kernels.…”

Section: Super-resolution (mentioning)

confidence: 99%

AutoSR4EO: An AutoML Approach to Super-Resolution for Earth Observation Images

Wąsala, Marselis, Arp et al. 2024, Remote Sensing

Super-resolution (SR), a technique to increase the resolution of images, is a pre-processing step in the pipelines of applications of Earth observation (EO) data. The manual design and optimisation of SR models that are specific to every possible EO use case is a laborious process that creates a bottleneck for EO analysis. In this work, we develop an automated machine learning (AutoML) method to automate the creation of dataset-specific SR models. AutoML is the study of the automatic design of high-performance machine learning models. We present the following contributions. (i) We propose AutoSR4EO, an AutoML method for automatically constructing neural networks for SR. We design a search space based on state-of-the-art residual neural networks for SR and incorporate transfer learning. Our search space is extendable, making it possible to adapt AutoSR4EO to future developments in the field. (ii) We introduce a new real-world single-image SR (SISR) dataset, called SENT-NICFI. (iii) We evaluate the performance of AutoSR4EO on four different datasets against the performance of four state-of-the-art baselines and a vanilla AutoML SR method, with AutoSR4EO achieving the highest average ranking. Our results show that AutoSR4EO performs consistently well over all datasets, demonstrating that AutoML is a promising method for improving SR techniques for EO images.

“…However, currently, diffusion models are primarily used in the field of image generation. Unlike discriminative models, which can easily compute the correlation between predicted results and ground truth, there are only a few generative tasks that have well-defined ground truth, such as image super-resolution [23]. Currently, there is no research on incorporating the learning capability of diffusion models into crack detection.…”

Section: Introduction (mentioning)

confidence: 99%

The Crack Diffusion Model: An Innovative Diffusion-Based Method for Pavement Crack Detection

Zhang, Chen, et al. 2024, Remote Sensing

Pavement crack detection is of significant importance in ensuring road safety and smooth traffic flow. However, pavement cracks come in various shapes and forms which exhibit spatial continuity, and algorithms need to adapt to different types of cracks while preserving their continuity. To address these challenges, an innovative crack detection framework, CrackDiff, based on the generative diffusion model, is proposed. It leverages the learning capabilities of the generative diffusion model for the data distribution and latent spatial relationships of cracks across different sample timesteps and generates more accurate and continuous crack segmentation results. CrackDiff uses crack images as guidance for the diffusion model and employs a multi-task UNet architecture to predict mask and noise simultaneously at each sampling step, enhancing the robustness of generations. Compared to other models, CrackDiff generates more accurate and stable results. Through experiments on the Crack500 and DeepCrack pavement datasets, CrackDiff achieves the best performance (F1 = 0.818 and mIoU = 0.841 on Crack500, and F1 = 0.841 and mIoU = 0.862 on DeepCrack).
