Simultaneous Localization and Mapping (SLAM) has been widely applied in computer vision and robotics. In dynamic environments, which are very common in the real world, traditional visual SLAM systems suffer a significant drop in localization and mapping accuracy because of the static-world assumption. Recently, semantic visual SLAM systems for dynamic scenes have attracted increasing attention; they use the semantic information of images to help remove dynamic feature points. Existing semantic visual SLAM systems commonly detect dynamic feature points using a semantic prior, a geometric constraint, or a combination of the two, and then remove the map points corresponding to those dynamic feature points. In the visual SLAM framework, pose estimation ultimately revolves around the 3D map points, so the key to improving the accuracy of a visual SLAM system is to build a more accurate and reliable map. Existing semantic visual SLAM systems thus acquire reliable map points only indirectly, which brings several drawbacks. In this paper, we present SDF-SLAM: Semantic Depth Filter SLAM, a semantic visual SLAM system for dynamic environments that uses a depth filter to directly judge whether a 3D map point is dynamic. First, semantic information is integrated into the original purely geometric SLAM system via a semantic optical flow method to perform reliable map initialization. Second, a semantic depth filter following a Gaussian-Uniform mixture distribution is designed to model the inverse depth of each map point. Third, the inverse depth of each 3D map point is updated within a Bayesian estimation framework, and each map point is classified as active or inactive. Finally, only the active map points are used to achieve robust camera pose tracking. Experiments on the TUM dataset demonstrate that our approach outperforms the original ORB-SLAM2 and other state-of-the-art semantic SLAM systems.
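
To make the depth-filter step concrete, the sketch below implements the standard Gaussian-Uniform mixture update over inverse depth (the Vogiatzis-Hernández parametrization popularized by SVO), which matches the mixture model named above. The abstract does not give the paper's exact equations or its semantic weighting, so the `DepthSeed` state, the `update_seed` formulas, the `is_active` rule, and all thresholds are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a Gaussian-Uniform depth filter over inverse depth,
# assuming the Vogiatzis-Hernandez parametrization used in SVO.
# All names and thresholds are illustrative, not the paper's code.
import math

class DepthSeed:
    """Per-map-point filter state: N(mu, sigma2) over inverse depth,
    Beta(a, b) over the probability that observations are inliers."""
    def __init__(self, mu, sigma2, z_range):
        self.mu = mu            # mean of the inverse depth
        self.sigma2 = sigma2    # variance of the inverse depth
        self.a = 10.0           # Beta inlier pseudo-count (prior)
        self.b = 10.0           # Beta outlier pseudo-count (prior)
        self.z_range = z_range  # support of the uniform outlier component

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def update_seed(seed, x, tau2):
    """Fuse one inverse-depth measurement x (variance tau2) into the seed.

    Measurement model: p(x | z, pi) = pi * N(x | z, tau2) + (1 - pi) * U(0, z_range).
    The posterior is approximated by moment matching with Beta(a, b) * N(mu, sigma2).
    """
    norm_scale = math.sqrt(seed.sigma2 + tau2)
    # Gaussian product for the inlier hypothesis
    s2 = 1.0 / (1.0 / seed.sigma2 + 1.0 / tau2)
    m = s2 * (seed.mu / seed.sigma2 + x / tau2)
    # Responsibilities of the inlier / outlier hypotheses
    c1 = seed.a / (seed.a + seed.b) * normal_pdf(x, seed.mu, norm_scale)
    c2 = seed.b / (seed.a + seed.b) / seed.z_range
    c1, c2 = c1 / (c1 + c2), c2 / (c1 + c2)
    # First and second moments of the inlier ratio under the new posterior
    f = (c1 * (seed.a + 1.0) + c2 * seed.a) / (seed.a + seed.b + 1.0)
    e = c1 * (seed.a + 1.0) * (seed.a + 2.0) / ((seed.a + seed.b + 1.0) * (seed.a + seed.b + 2.0)) \
      + c2 * seed.a * (seed.a + 1.0) / ((seed.a + seed.b + 1.0) * (seed.a + seed.b + 2.0))
    # Moment-matched Gaussian over inverse depth
    mu_new = c1 * m + c2 * seed.mu
    seed.sigma2 = c1 * (s2 + m * m) + c2 * (seed.sigma2 + seed.mu ** 2) - mu_new ** 2
    seed.mu = mu_new
    # Moment-matched Beta over the inlier ratio
    seed.a = (e - f) / (f - e / f)
    seed.b = seed.a * (1.0 - f) / f

def is_active(seed, inlier_ratio_thresh=0.6):
    """Illustrative active/inactive split: keep a point for tracking only if
    its expected inlier ratio is high, i.e. its depth is consistently
    re-observed (static); otherwise mark it inactive (likely dynamic)."""
    return seed.a / (seed.a + seed.b) > inlier_ratio_thresh

if __name__ == "__main__":
    seed = DepthSeed(mu=0.5, sigma2=1.0, z_range=1.0)
    for x in (0.52, 0.49, 0.51, 0.90):  # three consistent observations, one outlier
        update_seed(seed, x, tau2=0.01)
    print(f"inverse depth: {seed.mu:.3f}, active: {is_active(seed)}")
```

In this sketch, a dynamic object violates the static-world assumption, so its repeated inverse-depth observations are mutually inconsistent: the outlier evidence `b` grows, the expected inlier ratio falls, and the point is demoted to inactive, which realizes the direct, map-level rejection of dynamic points described in the abstract.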