2021
DOI: 10.1016/j.displa.2021.102082

RAFNet: RGB-D attention feature fusion network for indoor semantic segmentation

Cited by 18 publications (6 citation statements)
References 22 publications
“…To validate the effectiveness of the proposed model in this paper, we compare the proposed method with state-of-the-art methods (ESANet [24], IEMNet [60], SGACNet [61], Z-ACN [62], DynMM [63], RDFNet [7], RAFNet [64], SA-Gate [65], RedNet [8], ACNet [23], SGNet [27], ShapeConv [66]) on the NYU-Depth V2 dataset. For a fair comparison, we compare our method only with others built on the ResNet architecture, which employ ResNet backbones of varying depths and quantities.…”
Section: Quantitative Experimental Results on NYU-Depth V2 and SUN RG… (mentioning)
confidence: 99%
“…On the three evaluation metrics, CMANet achieves 74.2% pixel accuracy, 60.6% mean accuracy, and 47.6% mean IoU. On the most important metric, mean IoU, CMANet improves by 2.7% over RefineNet-101 [45], by 1.7% over LSD-GF [51], and by 0.1% over RAFNet [43]. Note that we only use ResNet-50 as our backbone, which suggests that CMANet could reach even better performance with a more powerful backbone.…”
Section: Methods (mentioning)
confidence: 99%
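The three scores quoted above are the standard semantic-segmentation metrics computed from a per-class confusion matrix. The snippet below is a minimal sketch only (assuming NumPy; the function name and matrix layout are illustrative, not code from CMANet or RAFNet):

```python
import numpy as np

def segmentation_metrics(conf: np.ndarray):
    """Pixel accuracy, mean accuracy, and mean IoU from a confusion
    matrix where conf[i, j] counts pixels of ground-truth class i
    that were predicted as class j."""
    tp = np.diag(conf).astype(float)        # correctly classified pixels per class
    gt = conf.sum(axis=1).astype(float)     # ground-truth pixels per class
    pred = conf.sum(axis=0).astype(float)   # predicted pixels per class

    with np.errstate(divide="ignore", invalid="ignore"):
        pixel_acc = tp.sum() / conf.sum()   # fraction of all pixels labelled correctly
        mean_acc = np.nanmean(tp / gt)      # per-class recall, averaged over classes
        mean_iou = np.nanmean(tp / (gt + pred - tp))
    return pixel_acc, mean_acc, mean_iou
```

On NYU-Depth V2 these metrics are typically averaged over the 40-class label set.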
“…Since the RGB and depth modalities must both be fully exploited in RGB-D semantic segmentation, cross-modality fusion is crucial. The fusion can be achieved via element-wise summation, concatenation, or a combination of both, and adapted by latent learning [4, 13, 14, 18, 19, 41, 42, 43]. For learning common and specific parts from cross-modality features, ref.…”
Section: Related Work (mentioning)
confidence: 99%
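The passage above names element-wise summation and concatenation as the two basic cross-modality fusion operators. The block below is only a rough illustration of those options (assuming PyTorch; the class name, parameters, and shapes are invented, and this is not the fusion module of RAFNet or any cited work):

```python
import torch
import torch.nn as nn

class SimpleRGBDFusion(nn.Module):
    """Fuse same-shaped RGB and depth feature maps either by
    element-wise summation or by concatenation + 1x1 convolution."""
    def __init__(self, channels: int, mode: str = "sum"):
        super().__init__()
        self.mode = mode
        if mode == "concat":
            # project the doubled channel count back down to `channels`
            self.proj = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, rgb_feat, depth_feat):
        if self.mode == "sum":
            return rgb_feat + depth_feat                  # element-wise summation
        fused = torch.cat([rgb_feat, depth_feat], dim=1)  # channel concatenation
        return self.proj(fused)

# Example with 256-channel feature maps at 1/8 input resolution
rgb, depth = torch.randn(2, 256, 60, 80), torch.randn(2, 256, 60, 80)
out = SimpleRGBDFusion(256, mode="concat")(rgb, depth)    # -> (2, 256, 60, 80)
```

Attention-based fusion, as in RAFNet's title, would additionally reweight the two streams before combining them; that step is deliberately left out of this minimal sketch.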
“…The feature fusion channel fuses the feature maps generated by the third layer of the pyramid structure. Each denoised feature map is first down-sampled (Ge et al., 2021; Maurya et al., 2021; Yan et al., 2021) so that all maps have the same size. The feature fusion model then sums the feature maps element-wise to form a new feature map.…”
Section: Proposed Methods (mentioning)
confidence: 99%
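As a minimal sketch of the down-sample-then-sum fusion described in the quote (assuming PyTorch; the function name, interpolation mode, and sizes are assumptions, not details from the cited work):

```python
import torch
import torch.nn.functional as F

def downsample_and_sum(feature_maps, target_size):
    """Resize each (N, C, H, W) feature map to `target_size` and sum
    them element-wise into a single fused feature map. All maps must
    share the same batch and channel dimensions."""
    resized = [
        F.interpolate(f, size=target_size, mode="bilinear", align_corners=False)
        for f in feature_maps
    ]
    return torch.stack(resized, dim=0).sum(dim=0)

# Example: three denoised 64-channel maps at different resolutions
maps = [torch.randn(1, 64, 120, 160),
        torch.randn(1, 64, 60, 80),
        torch.randn(1, 64, 30, 40)]
fused = downsample_and_sum(maps, target_size=(30, 40))   # -> (1, 64, 30, 40)
```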