This paper describes an extension of the High Efficiency Video Coding (HEVC) standard for coding multi-view video and depth data. In addition to the known concept of disparity-compensated prediction, inter-view motion parameter prediction and inter-view residual prediction are developed and integrated for coding the dependent video views. Furthermore, for depth coding, the HEVC extension includes new intra coding modes, modified motion compensation and motion vector coding, and the concept of motion parameter inheritance. A novel encoder control uses view synthesis optimization, which guarantees that high-quality intermediate views can be generated from the decoded data. The bitstream format supports the extraction of partial bitstreams, so that conventional 2D video, stereo video, and the full multi-view video plus depth format can be decoded from a single bitstream. Objective and subjective results are presented, demonstrating that the proposed approach provides 50% bit rate savings in comparison with HEVC simulcast and 20% in comparison with a straightforward multi-view extension of HEVC without the newly developed coding tools.
In this paper, we describe a video coding design that enables a higher coding efficiency than the HEVC standard. The proposed video codec follows the design of block-based hybrid video coding but includes a number of advanced coding tools. Some of the incorporated advanced concepts were developed by the Joint Video Exploration Team, while others are newly proposed. The key aspects of these newly proposed tools are the following. A video frame is subdivided into rectangles of variable size using a binary partitioning with variable split ratios. Three new approaches for generating spatial intra prediction signals are supported: a line-wise application of conventional intra prediction modes, coupled with a mode-dependent processing order; a region-based template matching prediction method; and intra prediction modes based on neural networks. For motion-compensated prediction, a multi-hypothesis mode with more than two motion hypotheses can be used. In transform coding, mode-dependent combinations of primary and secondary transforms are applied. Moreover, scalar quantization is replaced by trellis-coded quantization, and the entropy coding of the quantized transform coefficients is improved. The intra and inter prediction signals can be filtered using an edge-preserving diffusion filter or a non-linear DCT-based thresholding operation.
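The binary partitioning with variable split ratios mentioned in this abstract can be illustrated with a minimal sketch. It is not the codec's actual partitioning scheme: the ratio set `SPLIT_RATIOS`, the minimum block side `MIN_SIZE`, and the names `Rect` and `binary_splits` are hypothetical choices made for illustration, since the abstract states only that the split ratios are variable. The sketch enumerates the two-rectangle splits an encoder could evaluate for one block.

```python
from typing import List, NamedTuple, Tuple


class Rect(NamedTuple):
    """Axis-aligned block: top-left corner (x, y), width w, height h."""
    x: int
    y: int
    w: int
    h: int


# Assumed values for illustration only; the abstract does not specify them.
SPLIT_RATIOS = (0.25, 0.5, 0.75)
MIN_SIZE = 4  # assumed minimum block side, in samples


def binary_splits(block: Rect) -> List[Tuple[str, float, Rect, Rect]]:
    """Enumerate every allowed split of `block` into two rectangles.

    A split is kept only if both parts are at least MIN_SIZE wide/high
    and the split position is a multiple of MIN_SIZE.
    """
    candidates = []
    for ratio in SPLIT_RATIOS:
        # Vertical split line: divide the width at the given ratio.
        left_w = int(block.w * ratio)
        if (left_w >= MIN_SIZE and block.w - left_w >= MIN_SIZE
                and left_w % MIN_SIZE == 0):
            candidates.append((
                "vertical", ratio,
                Rect(block.x, block.y, left_w, block.h),
                Rect(block.x + left_w, block.y, block.w - left_w, block.h),
            ))
        # Horizontal split line: divide the height at the given ratio.
        top_h = int(block.h * ratio)
        if (top_h >= MIN_SIZE and block.h - top_h >= MIN_SIZE
                and top_h % MIN_SIZE == 0):
            candidates.append((
                "horizontal", ratio,
                Rect(block.x, block.y, block.w, top_h),
                Rect(block.x, block.y + top_h, block.w, block.h - top_h),
            ))
    return candidates


if __name__ == "__main__":
    for direction, ratio, first, second in binary_splits(Rect(0, 0, 16, 8)):
        print(direction, ratio, first, second)
```

In a real encoder, each candidate would typically be evaluated by a rate-distortion cost and the chosen parts split again recursively; that decision logic is left out here because the abstract does not describe it.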