2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr46437.2021.00155
FVC: A New Framework towards Deep Video Compression in Feature Space

Abstract: Learning-based video compression has attracted increasing attention in the past few years. Previous hybrid coding approaches rely on pixel-space operations to reduce spatial and temporal redundancy, which may suffer from inaccurate motion estimation or less effective motion compensation. In this work, we propose a feature-space video coding network (FVC) that performs all major operations (i.e., motion estimation, motion compression, motion compensation and residual compression) in the feature space. Specifical…

Cited by 183 publications (114 citation statements) | References 31 publications
“…For the traditional video codecs [52] [53], different linear transformations are exploited to better capture the statistical characteristics of the texture and motion information within the videos. Later, learnable video codecs [54] [55] [56] [57] [58] [59] [60] have gained increasing attention. Following the traditional hybrid video compression framework, Lu et al. [54] proposed the first end-to-end optimized video compression framework, in which all the key components in H.264/H.265 are replaced with deep neural networks.…”
Section: Video Compressionmentioning
confidence: 99%
“…[24,49], which use 3D convolution architectures, and Refs. [3,9,11,12,16,23,26,33,36,37,48,50,52,53,69], which model P-frames as an optical flow field applied to the previous frame plus a residual model.…”
Section: Related Workmentioning
confidence: 99%
“…first estimate of the current frame. In addition to the optical flow, pixel-level [3,37] or feature-level [26] residuals are compressed. The final prediction for the frame is given by applying the optical flow field to the previous reconstructed frame in an operation known as warping or motion compensation and adding the residuals.…”
Section: Implicit Video Representationsmentioning
confidence: 99%
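The motion-compensation step quoted above — backward-warping the previous reconstructed frame with the decoded optical flow, then adding the decoded residual — can be sketched as follows. This is a minimal illustration, not any cited codec's implementation; the function names and the flow convention (channel 0 = horizontal displacement, channel 1 = vertical) are assumptions for this example.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def motion_compensate(prev_frame, flow):
    """Backward-warp a single-channel frame by an optical flow field.

    prev_frame: (H, W) float array, the previous reconstructed frame.
    flow: (H, W, 2) float array; flow[..., 0] is the horizontal (x)
    displacement and flow[..., 1] the vertical (y) displacement
    (an assumed convention for this sketch).
    """
    h, w = prev_frame.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    # For each output pixel, sample the previous frame at the
    # flow-displaced location, with bilinear interpolation (order=1).
    coords = np.stack([ys + flow[..., 1], xs + flow[..., 0]])
    return map_coordinates(prev_frame, coords, order=1, mode='nearest')

def reconstruct(prev_frame, flow, residual):
    """Warped prediction plus decoded residual gives the final frame."""
    return motion_compensate(prev_frame, flow) + residual
```

Feature-space approaches such as FVC apply the same warp-plus-residual idea to learned feature maps rather than raw pixels, which is why the warping operator itself is unchanged in spirit.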
“…Most previous works focus on the low-delay P configuration [6,7] (used for videoconferencing) and omit the Random Access configuration (used for streaming at large). Furthermore, most neural codecs [10,11] are assessed using an I-frame period (e.g. 10 or 12 frames, regardless of the video framerate) shorter than expected by the Common Test Conditions of modern video coders such as HEVC or VVC [12].…”
Section: Introduction and Related Workmentioning
confidence: 99%
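The point about I-frame periods can be made concrete with a toy sketch. The helper below is hypothetical (not from any cited work): it labels frames I or P given an intra period, showing that a short period like 10 inserts far more expensive intra frames per second than a period on the order of the framerate, which is roughly what Common Test Conditions-style evaluations expect.

```python
def frame_types(num_frames, intra_period):
    """Label each frame 'I' (intra-coded) or 'P' (predicted).

    An I-frame is forced every `intra_period` frames; all other
    frames are predicted from previous ones.
    """
    return ['I' if i % intra_period == 0 else 'P'
            for i in range(num_frames)]

# For 2 seconds of 30 fps video (60 frames):
short = frame_types(60, 10)   # I-frame every 10 frames -> 6 I-frames
longer = frame_types(60, 32)  # I-frame every 32 frames -> 2 I-frames
```

Because I-frames cost many more bits than P-frames, comparing a codec evaluated with a 10-frame intra period against one evaluated with a framerate-scale period is not an apples-to-apples rate comparison — which is the criticism the quoted passage raises.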