2018
DOI: 10.1016/j.patcog.2018.07.020

MoE-SPNet: A mixture-of-experts scene parsing network

Abstract: Scene parsing is an indispensable component in understanding the semantics within a scene. Traditional methods rely on handcrafted local features and probabilistic graphical models to incorporate local and global cues. Recently, methods based on fully convolutional neural networks have achieved new records on scene parsing. The proposed MoE-SPNet incorporates a convolutional mixture-of-experts layer to assess the importance of features from different levels, and is evaluated on PASCAL VOC 2012 and SceneParse150 based on two kinds of baseline models, FCN-8s and DeepLab-ASPP.

Cited by 17 publications (4 citation statements)
References 61 publications

“…Yuan et al [32] propose the pyramid object context to model the category dependencies. Fu et al [33] propose a mixture-of-experts based scene parsing network that incorporates a convolutional mixture-of-experts layer to assess the importance of features from different levels. Zhao et al [34] propose the point-wise spatial attention network to relax the local neighborhood constraint.…”
Section: B. FCN-Based Methods
confidence: 99%
“…

Method           Backbone    Pixel Acc.%  Mean IoU%
FCN8s [4]        VGG-16      71.32        29.39
FCN8s [4]        ResNet-50   74.57        34.38
FCN8s [4]        ResNet-101  74.83        34.62
DilatedNet [5]   VGG-16      73.55        32.31
SegNet [7]       VGG-16      71.00        21.64
CascadeNet [3]   VGG-16      74.52        34.90
RefineNet [46]   ResNet-101  -            40.20
RefineNet [46]   ResNet-152  -            40.71
PSPNet [8]       ResNet-50   80.04        41.68
PSPNet [8]       ResNet-101  81.39        43.29
PSPNet [8]       ResNet-269  81.69        44.94
SAC [48]         ResNet-101  81.86        44.30
DeepLabv2 [9]    VGG-16      74.88        33.03
DeepLabv2 [9]    ResNet-101  77.31        36.75
MoE-SPNet [33]   VGG-16      75.50        34.35
MoE-SPNet [33]   ResNet-101  78.02        37.89
EncNet [27]      ResNet-50   79.73        41.11
EncNet [27]      ResNet-101  81.69        44.65
DSSPN [58]       VGG-16      76.04        34.56
DSSPN [58]       ResNet-101  81.13        43.68
PSANet [34]      ResNet-50   80.92        42.97
PSANet [34]      ResNet-101  81.51        43.77
UperNet [59]     ResNet-50   79.98        41.22
GANet [35]       VGG-16      78.64        41.52
GANet [35]       ResNet-50   81.80        44.71
GANet [35]       ResNet-101  82.14        45.36
WResNet [53]     ResNet-101  81.17        43.73
OCNet [32]       ResNet-50   80.44        42.63
OCNet [32]       ResNet-101  81.86        45.08
Fast-OCNet [32]  ResNet…

…jects, and the labels are inaccurate with ambiguous annotations. Thus, it mimics a more natural object occurrence in daily scenes.…”
Section: Methods
confidence: 99%
“…MoE aims to learn a system composed of many separate networks (experts), where each expert learns to handle a subset of the whole dataset. Recently, deep MoE methods have shown their superiority in image recognition [1,20,51], machine translation [44], scene parsing [14], and so on. Unlike these works, we design a learnable voting network that can be updated with a novel meta-learning algorithm.…”
Section: Related Work
confidence: 99%
“…Most deep-learning-based semantic segmentation methods are designed for single-modality RGB data, achieving prominent performance on many challenging large-scale datasets [11,12]. However, RGB imaging sensors are highly sensitive to light and therefore susceptible to adverse lighting conditions, like darkness or overexposure.…”
Section: Introduction
confidence: 99%