2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022
DOI: 10.1109/cvpr52688.2022.00074

SASIC: Stereo Image Compression with Latent Shifts and Stereo Attention

Abstract: In this paper, we present ECSIC, a novel learned method for stereo image compression. Our proposed method compresses the left and right images in a joint manner by exploiting the mutual information between the images of the stereo image pair using a novel stereo cross attention (SCA) module and two stereo context modules. The SCA module performs cross-attention restricted to the corresponding epipolar lines of the two images and processes them in parallel. The stereo context modules improve the entropy estimat…
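
To illustrate what "cross-attention restricted to the corresponding epipolar lines" could look like, here is a minimal NumPy sketch. It assumes rectified stereo pairs, so that epipolar lines coincide with image rows. The function name `row_cross_attention`, the feature shapes, and the omission of learned query/key/value projections are all assumptions made for illustration; this is not the authors' SCA module.

```python
import numpy as np

def row_cross_attention(query_feat, key_feat):
    """Hypothetical helper: each pixel in one view attends only to pixels
    on the same row of the other view.

    query_feat, key_feat: feature maps of shape (H, W, C).
    Returns the attended features (H, W, C) and the weights (H, W, W).
    """
    _, _, c = query_feat.shape
    # scores[y, i, j]: similarity between column i of query row y
    # and column j of key row y (no mixing across rows).
    scores = np.einsum("hic,hjc->hij", query_feat, key_feat) / np.sqrt(c)
    # Numerically stable softmax over the columns of the other view.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Assumption: the key features double as values.
    attended = np.einsum("hij,hjc->hic", weights, key_feat)
    return attended, weights

rng = np.random.default_rng(0)
left = rng.standard_normal((4, 6, 8))
right = rng.standard_normal((4, 6, 8))

# Both directions are independent of each other, so they could run in parallel.
left_from_right, w_lr = row_cross_attention(left, right)
right_from_left, w_rl = row_cross_attention(right, left)
```

Restricting attention to a single row keeps the weight tensor at H × W × W entries instead of the (H·W) × (H·W) entries that full cross-attention over both images would need.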

Cited by 13 publications (8 citation statements)
References 42 publications
“…In this case, the PSNR with (without) the JCT module at the encoder (decoder) improves (drops) by about 0.16dB (0.73dB) at the same bpp level. We further report the compression results when the JCT module is directly replaced by other inter-view fusion operations such as concatenation in Mital et al (2022b), stereo attention module (SAM) in Wödlinger et al (2022) and bi-directional contextual transform module (Bi-CTM) in Lei et al (2022). These operations lead to an increase of the bitrate by 32.73%, 27.99%, 10.11% compared with our method.…”
Section: Ablation Study (mentioning)
confidence: 93%
“…(2) Joint model has access to a set of multi-view images and explicitly utilizes the inter-view redundancy to achieve a high compression ratio. According to performance comparisons in Wödlinger et al (2022), conventional video standards can be applied in the MIC, where each set of multi-view images is compressed as a multi-frame video sequence by using both HEVC (Sullivan et al, 2012) and VVC (Bross et al, 2021) with lowdelay P configuration as well as YUV444 input format. We also test MV-HEVC (Tech et al, 2015) with the multi-view intra mode.…”
Section: Methods (mentioning)
confidence: 99%