The Moving Picture Experts Group (MPEG) is developing a standard for immersive video coding called MPEG Immersive Video (MIV) and, as part of the standardization process, is releasing reference software called the Test Model for Immersive Video (TMIV). TMIV efficiently compresses an immersive video comprising a set of texture and depth views acquired using multiple cameras within a limited 3D viewing space. Moreover, it renders a view at an arbitrary position and orientation with six degrees of freedom (6DoF). However, because the reconstructed depth is crucial for achieving the required quality of a rendered viewport, the existing depth quantization applied to the depth atlas in TMIV is insufficient. To address this issue, we propose a nonlinear depth quantization method that allocates more codewords to depth subranges with a higher occurrence of depth values located at edge regions, which are important for rendered view quality. Considering computational complexity and bitstream overhead, we implement the proposed nonlinear quantization using piecewise linear scaling. The experimental results show that the proposed method yields PSNR-based Bjøntegaard delta rate gains of 5.2% and 4.9% in end-to-end performance for the High- and Low-bitrate (BR) ranges, respectively. Moreover, subjective quality improvement is mainly observed at the object boundaries of the rendered viewport. The proposed nonlinear quantization method has been adopted into TMIV as a candidate standard technology for the next MIV edition.
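The idea of allocating codewords through piecewise linear scaling can be illustrated with a minimal sketch. This is not the TMIV implementation: the function names, the equal-width input intervals, the 10-bit codeword range, and the uniform `floor` share are all assumptions made for illustration, and the depth value is assumed to be normalized to [0, 1]. Each interval receives codewords in proportion to a weight, for example the count of edge-region depth samples that fall in it.

```python
from bisect import bisect_right


def build_pivots(weights, max_code=1023, floor=0.05):
    """Return codeword pivots for equal-width depth intervals (hypothetical helper).

    weights[i] is the importance of interval i, e.g. the number of
    edge-region depth samples inside it (an assumption of this sketch).
    """
    k = len(weights)
    total = sum(weights)
    if total <= 0:
        # No statistics available: fall back to uniform (linear) quantization.
        shares = [1.0 / k] * k
    else:
        # Blend with a uniform share so every interval keeps some codewords.
        shares = [floor / k + (1.0 - floor) * (w / total) for w in weights]
    pivots = [0.0]
    for s in shares:
        pivots.append(pivots[-1] + s * max_code)
    pivots[-1] = float(max_code)  # remove accumulated rounding error
    return pivots


def quantize(d, pivots):
    """Map a normalized depth d in [0, 1] to an integer codeword."""
    k = len(pivots) - 1
    d = min(max(d, 0.0), 1.0)
    i = min(int(d * k), k - 1)   # index of the interval containing d
    frac = d * k - i             # position of d inside that interval
    return round(pivots[i] + frac * (pivots[i + 1] - pivots[i]))


def dequantize(code, pivots):
    """Map a codeword back to a normalized depth (inverse piecewise linear map)."""
    k = len(pivots) - 1
    i = max(min(bisect_right(pivots, code) - 1, k - 1), 0)
    frac = (code - pivots[i]) / (pivots[i + 1] - pivots[i])
    return (i + frac) / k


# Usage: four intervals, with most edge samples in the third one.
pivots = build_pivots([1, 1, 6, 1])
code = quantize(0.6, pivots)
reconstructed = dequantize(code, pivots)
```

Only the pivot values would need to be signaled to the decoder in such a scheme, which is one reason a piecewise linear mapping keeps bitstream overhead small compared with an arbitrary lookup table.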
This paper presents Neural Network (NN) training methods that reflect block partitioning for Matrix-based Intra Prediction (MIP)-based networks. Training with a dataset that considers coding block partitioning yields an NN-based predictor that is better suited to a legacy block-based video codec than training that does not consider block partitioning. In addition, training using the block partitioning of actual video encoding allows better intra prediction than a training method considering block partitioning in the training process. To evaluate the proposed training methods, the MIP-based intra-prediction networks are implemented in VVC in place of MIP. The experimental results show that the proposed training method considering the block partitioning of actual encoding gives a coding gain of 0.19% Bjøntegaard Delta (BD)-rate on average compared to training without considering block partitioning.
The emerging Versatile Video Coding (VVC) standard currently adopts Triangular Partitioning Mode (TPM) to enable more flexible inter prediction. Due to the motion search and motion storage required for TPM, the complexity of the encoder and decoder is significantly increased. This letter proposes two simplifications of TPM to reduce the complexity of the current design. The first reduces the number of motion vector combinations to be checked for the two partitions. This method decreases encoding time by 4% with negligible BD-rate loss. The second removes the reference picture remapping process in the motion vector storage of TPM. It reduces the complexity of the encoder and decoder without a BD-rate change for the random-access configuration.