2016 IEEE International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra.2016.7487368

Real-time 3D scene layout from a single image using Convolutional Neural Networks

Abstract: We consider the problem of understanding the 3D layout of indoor corridor scenes from a single image in real time. Identifying obstacles such as walls is essential for robot navigation, but also challenging due to the diversity in structure, appearance and illumination of real-world corridor scenes. Many current single-image methods make Manhattan-world assumptions, and break down in environments that do not fit this mold. They may also require complicated hand-designed features for image segmentation or clear …

Cited by 20 publications (6 citation statements)
References 19 publications
“…The authors use different scales to refine their predictions significantly. According to Yang et al. [28], this architecture starts with a low-resolution, rough output prediction and refines it by fusing with prior layers to give both local and global reasoning. VGG-16 is introduced as a convolutional and max pooling layer-based encoder block in this case, as shown in Figure 1.…”
Section: Network Architecture (mentioning)
confidence: 99%
“…Network parameters are optimized for binary cross-entropy via one of the SGD variants' parameter learning strategies. In contrast to the suggestion made by Yang et al. [28], our network's capabilities were increased through data augmentation, in which the original versions of the training images are used. The training process is sped up significantly thanks to the extensive use of transfer learning.…”
Section: Network Training (mentioning)
confidence: 99%
“…To detect planes in images, model fitting algorithms, such as RANSAC, are employed. It is also possible to combine modeling and a convolutional neural network (CNN) to identify planes, such as walls, in an image [136]. As for high-level features, several techniques were proposed for detecting objects and semantically labeling them in images including, but not limited to, conditional random fields (CRFs) [51], support vector machines (SVMs) [30], and deep neural networks (for example: single shot multi-box detector [74] and you only look once (YOLO) [104]).…”
Section: Data Association (mentioning)
confidence: 99%
“…The generation of 3D floor plans is usually performed by extracting primitive shapes such as cuboids or planes from data acquired with on board cameras [6], [7] and/or range sensors [8], [9]. The limitations of current techniques involve (i) the requirement of a large amount of input data (e.g.…”
Section: Introduction (mentioning)
confidence: 99%