2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017
DOI: 10.1109/cvpr.2017.353
|View full text |Cite
|
Sign up to set email alerts
|

Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes

Abstract: Semantic image segmentation is an essential component of modern autonomous driving systems, as an accurate understanding of the surrounding scene is crucial to navigation and action planning. Current state-of-the-art approaches in semantic image segmentation rely on pretrained networks that were initially developed for classifying images as a whole. While these networks exhibit outstanding recognition performance (i.e., what is visible?), they lack localization accuracy (i.e., where precisely is something loca… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1
1
1

Citation Types

1
333
0
9

Year Published

2017
2017
2024
2024

Publication Types

Select...
8
1
1

Relationship

0
10

Authors

Journals

citations
Cited by 528 publications
(343 citation statements)
references
References 51 publications
1
333
0
9
Order By: Relevance
“…Unlike typical CNNs, which consist of repeated multiple layers, such as convolutional layers, pooling layers, the activation function, and fully connected layers, deep convolutional encoderdecoder networks employ two separate deep neural networks: an encoder network and a decoder network. This network has been successfully applied to many visual segmentation tasks, such as scene understanding (Kendall, Badrinarayanan, & Cipolla, 2015), autonomous driving (Teichmann, Weber, Zoellner, Cipolla, & Urtasun, 2016), biomedical image segmentation (Ronneberger, Fischer, & Brox, 2015), and semantic segmentation (Badrinarayanan, Kendall, & Cipolla, 2017;Pohlen, Hermans, Mathias, & Leibe, 2017).…”
Section: Encoder-decoder Network For Semantic Segmentationmentioning
confidence: 99%
“…Unlike typical CNNs, which consist of repeated multiple layers, such as convolutional layers, pooling layers, the activation function, and fully connected layers, deep convolutional encoderdecoder networks employ two separate deep neural networks: an encoder network and a decoder network. This network has been successfully applied to many visual segmentation tasks, such as scene understanding (Kendall, Badrinarayanan, & Cipolla, 2015), autonomous driving (Teichmann, Weber, Zoellner, Cipolla, & Urtasun, 2016), biomedical image segmentation (Ronneberger, Fischer, & Brox, 2015), and semantic segmentation (Badrinarayanan, Kendall, & Cipolla, 2017;Pohlen, Hermans, Mathias, & Leibe, 2017).…”
Section: Encoder-decoder Network For Semantic Segmentationmentioning
confidence: 99%
“…30 . The beneficial implementation of residual learning has been thoroughly documented in not only image classification and segmentation 31 but also in areas of speech recognition 32 . Fullyconvolutional networks 33 , or networks designed such that input of any spatial dimensionality can be analyzed with no loss in performance, offer enormous benefit to problems where 1) prior knowledge of input size is inherently variable and 2) the experimental data of interest is memory exhaustive.…”
Section: D-cnn Architecture Training and Validationmentioning
confidence: 99%
“…To illustrate this dilemma, Fig. 1 gives the accuracy (mIoU) and inference speed (frames per second (fps)) obtained by several state-of-the-art methods, including FCN-8s [9], CRF-RNN [17], DeepLab [10], DeepLabv2 [12], DeepLabv3+ [13], ResNet-38 [18], PSPNet [11], DUC [19], RefineNet [20], LRR [21], DPN [22], FRRN [23], TwoColumn [24], SegNet [25], SQNet [26], ENet [27], arXiv:2003.08736v2 [cs.CV] 3 Apr 2020 ERFNet [28], ICNet [29], SwiftNetRN [30], LEDNet [31], BiSeNet1 [32], BiSeNet2 [32], DFANet [33] and our proposed method, on the Cityscapes test dataset. Clearly, how to achieve a good tradeoff between accuracy and speed is still a challenging problem.…”
Section: Introductionmentioning
confidence: 99%