ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021
DOI: 10.1109/icassp39728.2021.9413449

Mask4D: 4D Convolution Network for Light Field Occlusion Removal

Abstract: Current light field (LF) occlusion removal approaches usually select only a subset of the sub-aperture images (SAIs) or simply stack all SAIs to reconstruct the center view, which destroys the spatial layout of the SAIs. In this paper, we present a simple yet effective LF occlusion removal method named Mask4D, which is a 4D convolution-based encoder-decoder network. We propose to keep the spatial layout of the SAIs and to arrange all SAIs as a 5D input tensor in order to fully exploit the spatial connection information between SAIs. I…
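
The difference between stacking SAIs and keeping their layout can be illustrated with a small NumPy sketch. It is not the authors' code: the axis order (C, U, V, H, W), the 5×5 angular grid, the image size, and the function names `stack_sais` and `layout_preserving_tensor` are all assumptions made for illustration, since the abstract does not specify how the 5D tensor is laid out.

```python
import numpy as np

def stack_sais(sais):
    """Stack a U x V grid of SAIs along the channel axis.

    sais: array of shape (U, V, H, W, C).
    Returns an array of shape (H, W, U*V*C); the angular grid is
    flattened into channels, so the 2D arrangement of views is lost.
    """
    u, v, h, w, c = sais.shape
    return sais.transpose(2, 3, 0, 1, 4).reshape(h, w, u * v * c)

def layout_preserving_tensor(sais):
    """Keep the angular grid as explicit axes.

    Returns an array of shape (C, U, V, H, W) (assumed axis order),
    so a 4D kernel could slide over U, V, H and W together.
    """
    return sais.transpose(4, 0, 1, 2, 3)

# Hypothetical light field: 5x5 views, each a 32x48 RGB image.
lf = np.random.rand(5, 5, 32, 48, 3).astype(np.float32)

stacked = stack_sais(lf)                 # shape (32, 48, 75)
tensor5d = layout_preserving_tensor(lf)  # shape (3, 5, 5, 32, 48)

# The center view is still addressable by its angular position.
center_view = tensor5d[:, 2, 2]          # shape (3, 32, 48)
```

In the stacked form a 2D convolution sees the views only as an unordered set of channels, whereas the layout-preserving form keeps neighboring views adjacent along the U and V axes.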

Cited by 14 publications (7 citation statements)
References 19 publications
“…Apart from these aforementioned networks, many architectures have also been developed for LF image processing, such as bi-directional recurrent structure [41], 4D convolution [12], [55], [56] and cost volumes [28], [68]. The key difference between existing schemes and our disentangling mechanism is, we can fully use the information from all angular views and incorporate the LF structure prior.…”
Section: CNN Architectures for LF Image Processing
type: mentioning
confidence: 99%
“…4D Convolution. Since several recent works [12], [55], [56] used 4D convolutions to handle LF data and have achieved promising performance, we compare our disentangling mechanism with 4D convolutions by replacing our Distg-Block with a series of 4D residual blocks. As shown in Table 1, stacking 4D convolutions can result in a very large model size (i.e., 22.3M for 2×SR) but cannot introduce performance improvements.…”
Section: Ablation Study
type: mentioning
confidence: 99%
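
The model-size remark in the statement above follows from how convolution weights scale with kernel dimensionality. The sketch below is a rough illustration only: the channel width of 64 and kernel size of 3 are assumed values, the helper `conv_params` is hypothetical, and the numbers do not attempt to reproduce the 22.3M figure quoted from the citing paper.

```python
def conv_params(c_in, c_out, kernel_size, ndim, bias=True):
    """Parameter count of one dense N-dimensional convolution layer.

    Weights: c_in * c_out * kernel_size**ndim, plus one bias per
    output channel when bias is True.
    """
    weights = c_in * c_out * kernel_size ** ndim
    return weights + (c_out if bias else 0)

# Assumed layer width and kernel size, for illustration only.
p2d = conv_params(64, 64, 3, ndim=2)  # 64*64*3**2 + 64
p4d = conv_params(64, 64, 3, ndim=4)  # 64*64*3**4 + 64

# Ratio of weight counts (bias excluded) is 3**(4-2) = 9.
weight_ratio = (p4d - 64) // (p2d - 64)
```

Under these assumptions a single 3×3×3×3 layer carries nine times the weights of a 3×3 layer of the same width, so a stack of 4D residual blocks grows correspondingly faster than its 2D counterpart.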
“…Unlike conventional imaging, which captures the 2D projection of light rays, LF imaging collects data with many dimensions [1]. This abundance of visual information in LF pictures, in addition to their immersive description of the real world, may help several image processing and computer vision tasks, such as depth estimation [2, 3], de-occlusion [4, 5], salient object detection [6, 7], and image post-refocus [8].…”
Section: Introduction
type: mentioning
confidence: 99%
“…These images contain spatial and angular information about the 3D scenes. As a result, many applications have developed and benefited greatly from this huge amount of information, such as de-occlusion [1, 2], depth-sensing [3, 4, 5], saliency detection [6], and salient object detection [7]. In addition, LF could be promising to ease other applications such as the fruit-picking robot, where a robot traverses a whole field and harvests on its own [8, 9].…”
Section: Introduction
type: mentioning
confidence: 99%