Real-time 3D scene layout from a single image using Convolutional Neural Networks

Yang, Shichao; Maturana, Daniel; Scherer, Sebastian

doi:10.1109/icra.2016.7487368

Cited by 20 publications

(6 citation statements)

References 19 publications


“…The authors use different scales to refine their predictions significantly. According to Yang et al [28], this architecture starts with a low-resolution, rough output prediction and refines it by fusing with prior layers to give both local and global reasoning. VGG-16 is introduced as a convolutional and max pooling layer-based encoder block in this case, as shown in Figure 1.…”

Section: Network Architecture (mentioning)

confidence: 99%

“…Network parameters are optimized for binary cross-entropy via one of the SGD variants' parameter learning strategies. In contrast to the suggestion made by Yang et al [28], our network's capabilities were increased through data augmentation, in which the original versions of the training images are used. The training process is sped significantly thanks to the extensive use of transfer learning.…”

Section: Network Training (mentioning)

confidence: 99%

“…Additionally, practically eliminating noise will considerably improve the quality of the prediction. The segmentation noise filtering method is finalized using the probabilistic models described in [28]. Network parameters are optimized for binary cross-entropy via one of the SGD variants' parameter learning strategies.…”

Section: Network Training (mentioning)

confidence: 99%
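The citation statements above mention optimizing network parameters for binary cross-entropy with an SGD variant. As a minimal sketch of that idea only, and not the cited authors' implementation, the following computes the loss and applies one plain SGD update to a hypothetical single-feature logistic model. The function names, the learning rate, and the toy model are assumptions made for illustration.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def sigmoid(z):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def sgd_step(w, b, x, y, lr=0.1):
    """One plain SGD update on a single sample (hypothetical toy model p = sigmoid(w*x + b))."""
    p = sigmoid(w * x + b)
    grad = p - y  # derivative of the loss with respect to the logit
    return w - lr * grad * x, b - lr * grad


if __name__ == "__main__":
    w, b = 0.0, 0.0
    x, y = 1.0, 1  # one toy pixel feature labeled as foreground
    before = binary_cross_entropy([sigmoid(w * x + b)], [y])
    w, b = sgd_step(w, b, x, y)
    after = binary_cross_entropy([sigmoid(w * x + b)], [y])
    print(before, after)  # the loss should decrease after the update
```

In a segmentation network the same loss would be averaged over every pixel of the predicted mask, and the update would be applied to all weights through backpropagation rather than to a single scalar pair.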


Obstacle Avoidance Strategy for Mobile Robot Based on Monocular Camera

Dang, Bui. Electronics, 2023.

This research paper proposes a real-time obstacle avoidance strategy for mobile robots with a monocular camera. The approach uses a binary semantic segmentation FCN-VGG-16 to extract features from images captured by the monocular camera and estimate the position and distance of obstacles in the robot’s environment. Segmented images are used to create the frontal view of a mobile robot. Then, the optimized path planning based on the enhanced A* algorithm with a set of weighted factors, such as collision, path, and smooth cost improves the performance of a mobile robot’s path. In addition, a collision-free and smooth obstacle avoidance strategy will be devised by optimizing the cost functions. Lastly, the results of our evaluation show that the approach successfully detects and avoids static and dynamic obstacles in real time with high accuracy, efficiency, and smooth steering with low angle changes. Our approach offers a potential solution for obstacle avoidance in both global and local path planning, addressing the challenges of complex environments while minimizing the need for expensive and complicated sensor systems.


“…To detect planes in images, model fitting algorithms, such as RANSAC, are employed. It is also possible to combine modeling and a convolutional neural network (CNN) to identify planes, such as walls, in an image [136]. As for high-level features, several techniques were proposed for detecting objects and semantically labeling them in images including, but not limited to, conditional random fields (CRFs) [51], support vector machines (SVMs) [30], and deep neural networks (for example: single shot multi-box detector [74] and you only look once (YOLO) [104]).…”

Section: Data Association (mentioning)

confidence: 99%

Feature-based visual simultaneous localization and mapping: a survey

Azzam, Taha, Huang, et al. SN Appl. Sci., 2020.

Visual simultaneous localization and mapping (SLAM) has attracted high attention over the past few years. In this paper, a comprehensive survey of the state-of-the-art feature-based visual SLAM approaches is presented. The reviewed approaches are classified based on the visual features observed in the environment. Visual features can be seen at different levels; low-level features like points and edges, middle-level features like planes and blobs, and high-level features like semantically labeled objects. One of the most critical research gaps regarding visual SLAM approaches concluded from this study is the lack of generality. Some approaches exhibit a very high level of maturity, in terms of accuracy and efficiency. Yet, they are tailored to very specific environments, like feature-rich and static environments. When operating in different environments, such approaches experience severe degradation in performance. In addition, due to software and hardware limitations, guaranteeing a robust visual SLAM approach is extremely challenging. Although semantics have been heavily exploited in visual SLAM, understanding of the scene by incorporating relationships between features is not yet fully explored. A detailed discussion of such research challenges is provided throughout the paper.


“…The generation of 3D floor plans is usually performed by extracting primitive shapes such as cuboids or planes from data acquired with on board cameras [6], [7] and/or range sensors [8], [9]. The limitations of current techniques involve (i) the requirement of a large amount of input data (e.g.…”

Section: Introduction (mentioning)

confidence: 99%

Sigma-FP: Robot Mapping of 3D Floor Plans With an RGB-D Camera Under Uncertainty

Matez-Bandera, Monroy, González-Jiménez. IEEE Robot. Autom. Lett., 2022.

This work presents Sigma-FP, a novel 3D reconstruction method to obtain the floor plan of a multi-room environment from a sequence of RGB-D images captured by a wheeled mobile robot. For each input image, the planar patches of visible walls are extracted and subsequently characterized by a multivariate Gaussian distribution in the convenient Plane Parameter Space. Then, accounting for the probabilistic nature of the robot localization, we transform and combine the planar patches from the camera frame into a 3D global model, where the planar patches include both the plane estimation uncertainty and the propagation of the robot pose uncertainty. Additionally, processing depth data, we detect openings (doors and windows) in the wall, which are also incorporated in the 3D global model to provide a more realistic representation. Experimental results, in both real-world and synthetic environments, demonstrate that our method outperforms state-of-the-art methods, both in time and accuracy, while just relying on Atlanta world assumption.

