Deep Video Super-Resolution Using HR Optical Flow Estimation

Wang, Longguang; Guo, Yulan; Liu, Li; Lin, Zaiping; Deng, Xinpu; An, Wei

doi:10.1109/tip.2020.2967596

Cited by 150 publications

(107 citation statements)

References 55 publications

Supporting

Mentioning

107

Contrasting

Order By: Relevance

“…1) Video SR: Liao et al [23] proposed the first CNNbased video SR method by performing motion compensation and SR sequentially. Wang et al [24], [25] improved this scheme by estimating optical flow in HR space. Tao et al [26] proposed a joint learning framework for motion estimation and SR reconstruction.…”

Section: B Multi-image Srmentioning

confidence: 99%

Learning Parallax Attention for Stereo Image Super-Resolution

Wang

Liang³

et al. 2019

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Self Cite

257

228

View full text Add to dashboard Cite

Stereo image pairs can be used to improve the performance of super-resolution (SR) since additional information is provided from a second viewpoint. However, it is challenging to incorporate this information for SR since disparities between stereo images vary significantly. In this paper, we propose a parallax-attention stereo superresolution network (PASSRnet) to integrate the information from a stereo image pair for SR. Specifically, we introduce a parallax-attention mechanism with a global receptive field along the epipolar line to handle different stereo images with large disparity variations. We also propose a new and the largest dataset for stereo image SR (namely, Flickr1024). Extensive experiments demonstrate that the parallax-attention mechanism can capture correspondence between stereo images to improve SR performance with a small computational and memory cost. Comparative results show that our PASSRnet achieves the state-of-the-art performance on the Middlebury, KITTI 2012 and KITTI 2015 datasets.

show abstract

Section: B Multi-image Srmentioning

confidence: 99%

Learning Parallax Attention for Stereo Image Super-Resolution

Wang

Liang³

et al. 2019

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Self Cite

257

228

View full text Add to dashboard Cite

show abstract

“…In [34], a multi-memory network was proposed to extract temporal information from consecutive frames with several ConvLSTM layers [49]. In order to avoid the resolution conflict between LR optical flows and HR outputs, Wang et al [35] designed a novel VSR network to estimate HR optical flow in a coarseto-fine manner. Li et al [36] proposed a motion compensation network with a pyramid structure and adopted channel and spatial attention mechanism in the following reconstruction network.…”

Section: Video Super-resolution With Cnnsmentioning

confidence: 99%

Video super‐resolution with non‐local alignment network

Zhou

Chen

Ding

et al. 2021

IET Image Processing

View full text Add to dashboard Cite

Video super-resolution (VSR) aims at recovering high-resolution frames from their lowresolution counterparts. Over the past few years, deep neural networks have dominated the video super-resolution task because of its strong non-linear representational ability. To exploit temporal correlations, most deep neural networks have to face two challenges: (1) how to align consecutive frames containing motions, occlusions and blurring, and establish accurate temporal correspondences, (2) how to effectively fuse aligned frames and balance their contributions. In this work, a novel video super-resolution network, named NLVSR, is proposed to solve above problems in an efficient and effective manner. For alignment, a temporal-spatial non-local operation is employed to align each frame to the reference frame. Compared with existing alignment approaches, the proposed temporalspatial non-local operation is able to integrate the global information of each frame by a weighted sum, leading to a better performance in alignment. For fusion, an attentionbased progressive fusion framework was designed to integrate aligned frames gradually. To penalize the points with low-quality in aligned features, an attention mechanism was employed for a robust reconstruction. Experimental results demonstrate the superiority of the proposed network in terms of quantitative and qualitative evaluation, and surpasses other state-of-the-art methods by 0.33 dB at least. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

show abstract

“…Yi et al [40] utilized optical flow for simple motion compensation and proposed an ultra dense memory block (UDMB), which combines hierarchical features to improve performance. Wang et al [41] considered that there is a gap between the LR optical flows and the HR outputs, which could result in loss of details. Thus, they proposed super-resolve optical flows for video SR (SOF-VSR), which predicts the optical flows at highresolution scales to reduce the displacement.…”

Section: B Video Super-resolutionmentioning

confidence: 99%

Local-Global Fusion Network for Video Super-Resolution

Wang

Jin

et al. 2020

IEEE Access

View full text Add to dashboard Cite

The goal of video super-resolution technique is to address the problem of effectively restoring high-resolution (HR) videos from low-resolution (LR) ones. Previous methods commonly used optical flow to perform frame alignment and designed a framework from the perspective of space and time. However, inaccurate optical flow estimation may occur easily which leads to inferior restoration effects. In addition, how to effectively fuse the features of various video frames remains a challenging problem. In this paper, we propose a Local-Global Fusion Network (LGFN) to solve the above issues from a novel viewpoint. As an alternative to optical flow, deformable convolutions (DCs) with decreased multi-dilation convolution units (DMDCUs) are applied for efficient implicit alignment. Moreover, a structure with two branches, consisting of a Local Fusion Module (LFM) and a Global Fusion Module (GFM), is proposed to combine information from two different aspects. Specifically, LFM focuses on the relationship between adjacent frames and maintains the temporal consistency while GFM attempts to take advantage of all related features globally with a video shuffle strategy. Benefiting from our advanced network, experimental results on several datasets demonstrate that our LGFN can not only achieve comparative performance with state-of-the-art methods but also possess reliable ability on restoring a variety of video frames. The results on benchmark datasets of our LGFN are presented on https://github.com/BIOINSu/LGFN and the source code will be released as soon as the paper is accepted. INDEX TERMS Convolutional neural networks, deep learning, improved deformable convolution, localglobal feature fusion, video super-resolution.

show abstract

Deep Video Super-Resolution Using HR Optical Flow Estimation

Cited by 150 publications

References 55 publications

Learning Parallax Attention for Stereo Image Super-Resolution

Learning Parallax Attention for Stereo Image Super-Resolution

Video super‐resolution with non‐local alignment network

Local-Global Fusion Network for Video Super-Resolution

Contact Info

Product

Resources

About