2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
DOI: 10.1109/cvprw.2016.60
ReSeg: A Recurrent Neural Network-Based Model for Semantic Segmentation

Abstract: We propose a structured prediction architecture, which exploits the local generic features extracted by Convolutional Neural Networks and the capacity of Recurrent Neural Networks (RNN) to retrieve distant dependencies. The proposed architecture, called ReSeg, is based on the recently introduced ReNet model for image classification. We modify and extend it to perform the more challenging task of semantic segmentation. Each ReNet layer is composed of four RNNs that sweep the image horizontally and vertically in …
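The abstract's description of a ReNet layer (directional RNN sweeps over a grid of image patches) can be made concrete with a rough sketch. The following is a hypothetical PyTorch illustration, not the authors' released code; the patch size, hidden size, and the choice of GRUs as the recurrent units are assumptions.

```python
# Hypothetical sketch of a ReNet-style layer: the input is tiled into
# non-overlapping patches, a bidirectional RNN sweeps each column of patches
# (top-down and bottom-up), and a second bidirectional RNN sweeps each row of
# the result (left-to-right and right-to-left). Sizes are illustrative.
import torch
import torch.nn as nn

class ReNetLayer(nn.Module):
    def __init__(self, in_channels, hidden_size, patch_size=2):
        super().__init__()
        self.patch_size = patch_size
        patch_dim = in_channels * patch_size * patch_size
        # Bidirectional GRU sweeping each column of patches vertically.
        self.vertical = nn.GRU(patch_dim, hidden_size,
                               batch_first=True, bidirectional=True)
        # Bidirectional GRU sweeping each row of the vertically swept map.
        self.horizontal = nn.GRU(2 * hidden_size, hidden_size,
                                 batch_first=True, bidirectional=True)

    def forward(self, x):
        # x: (B, C, H, W); H and W are assumed divisible by patch_size.
        b, c, h, w = x.shape
        p = self.patch_size
        # Tile into non-overlapping p x p patches: (B, H', W', C*p*p).
        x = x.unfold(2, p, p).unfold(3, p, p)
        x = x.permute(0, 2, 3, 1, 4, 5).reshape(b, h // p, w // p, -1)
        hp, wp = h // p, w // p
        # Vertical sweep: each column of patches is one sequence.
        v_in = x.permute(0, 2, 1, 3).reshape(b * wp, hp, -1)
        v_out, _ = self.vertical(v_in)                       # (B*W', H', 2h)
        v_out = v_out.reshape(b, wp, hp, -1).permute(0, 2, 1, 3)
        # Horizontal sweep over the vertically swept features.
        h_in = v_out.reshape(b * hp, wp, -1)
        h_out, _ = self.horizontal(h_in)                     # (B*H', W', 2h)
        return h_out.reshape(b, hp, wp, -1).permute(0, 3, 1, 2)

# Usage: a 64x64 RGB image becomes a 32x32 feature map with 2*hidden channels.
features = ReNetLayer(in_channels=3, hidden_size=16)(torch.randn(1, 3, 64, 64))
```

Stacking such layers and adding an upsampling head is, per the abstract, how ReSeg extends ReNet from classification to dense semantic segmentation.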


Cited by 239 publications (159 citation statements)
References 39 publications
“…Extending this to 2 dimensions would make RCNs more suitable for processing images, and 3 dimensions could enable fast learning for video processing. Combining bi-directional (horizontal and vertical) scanning as done in this paper and in [36] does provide some of the benefits of 2D convolution, but we expect that such setup will prove to be sub-optimal for larger image sizes. Extensions that better mimic the local properties of 2D convolution are however non-trivial and, to our best knowledge, no such extensions have been proposed yet.…”
Section: Conclusion, Discussion and Future Work
confidence: 99%
“…RNN models have also proven to be effective for tasks with densely connected data such as semantic segmentation [76], scene parsing [51] and even as an alternative to Convolutional Neural Networks [65]. These works show that RNN models are capable of learning the dependencies between spatially correlated data such as image pixels.…”
Section: Related Work
confidence: 99%
“…For example, Zuo et al [54] converted each image into 1D spatial sequences by concatenating the CNN features of different regions, and utilized RNN to learn the spatial dependencies of image regions. Similar work appeared in [46]. The proposed ReNet replaced the ubiquitous convolutional+pooling layer with four recurrent neural networks that sweep horizontally and vertically in both directions across the image.…”
Section: Usage of CNN-RNN Framework
confidence: 85%
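
The excerpt above describes a different CNN-RNN coupling from ReNet's four-directional sweep: region features from a convolutional backbone are concatenated into a 1D spatial sequence and an RNN models their dependencies. The sketch below is a hypothetical illustration of that idea; the backbone, layer sizes, and row-major scan order are assumptions, not the cited authors' implementation.

```python
# Hypothetical sketch of the "CNN features as a 1D spatial sequence" idea:
# a small convolutional backbone yields a grid of region features, which are
# flattened in row-major order and fed to a bidirectional GRU.
import torch
import torch.nn as nn

class CNNFeatureSequenceRNN(nn.Module):
    def __init__(self, in_channels=3, feat_channels=64, hidden_size=128):
        super().__init__()
        # Convolutional backbone producing a coarse grid of region features.
        self.backbone = nn.Sequential(
            nn.Conv2d(in_channels, feat_channels, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(feat_channels, feat_channels, 3, stride=2, padding=1),
            nn.ReLU(),
        )
        # Bidirectional GRU over the row-major sequence of region features.
        self.rnn = nn.GRU(feat_channels, hidden_size,
                          batch_first=True, bidirectional=True)

    def forward(self, x):
        f = self.backbone(x)                    # (B, C, H', W')
        b, c, h, w = f.shape
        seq = f.flatten(2).transpose(1, 2)      # (B, H'*W', C), row-major scan
        out, _ = self.rnn(seq)                  # (B, H'*W', 2*hidden)
        return out.transpose(1, 2).reshape(b, -1, h, w)

# Usage: contextualized region features keep the spatial grid layout.
out = CNNFeatureSequenceRNN()(torch.randn(1, 3, 64, 64))   # (1, 256, 16, 16)
```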