Learning Monocular Visual Odometry with Dense 3D Mapping from Dense 3D Flow

Zhao, Cheng; Sun, Li; Purkait, Pulak; Duckett, Tom

doi:10.1109/iros.2018.8594151

Cited by 49 publications

(39 citation statements)

References 32 publications


“…The proposed approach can not only automatically learn effective feature representation, but also implicitly model sequential dynamics and relation for VO with the help of deep RNN. In [47] and [48], two approaches have been proposed to robust estimate the VO by considering the optical flow caused by the camera motion. In [48], the camera motion has been estimated by using the constraints with depth and optical flow.…”

Section: B Monocular-based Methods (mentioning)

confidence: 99%

“…In [47] and [48], two approaches have been proposed to robust estimate the VO by considering the optical flow caused by the camera motion. In [48], the camera motion has been estimated by using the constraints with depth and optical flow. In [47], a novel network architecture for estimating monocular camera motion which is composed of two branches that jointly learn a latent space representation of the input optical flow field and the camera motion estimate.…”

Section: B Monocular-based Methods (mentioning)

confidence: 99%


Ground-Plane-Based Absolute Scale Estimation for Monocular Visual Odometry

Zhou

Dai

2020

IEEE Trans. Intell. Transport. Syst.


Recovering absolute metric scale from a monocular camera is a challenging but highly desirable problem for monocular camera-based systems. By using different kinds of cues, various approaches have been proposed for scale estimation, such as camera height, object size etc. In this paper, firstly, we summarize different kinds of scale estimation approaches. Then, we propose a robust divide and conquer absolute scale estimation method based on the ground plane and camera height by analyzing the advantages and disadvantages of different approaches. By using the estimated scale, an effective scale correction strategy has been proposed to reduce the scale drift during the Monocular Visual Odometry (VO) estimation process. Finally, the effectiveness and robustness of the proposed method have been verified on both public and self-collected image sequences.



“…With the same result, Zou et al [60] jointly train for optical flow, pose and depth estimation simultaneously while Jiao et al [23] mutually improve semantics and depth and GeoNet [53] jointly estimates depth, optical flow and camera pose from video. Fully unsupervised monocular depth and visual odometry can also be entangled [58] and 3D mapping applications [57] are realized by heavily relying on dense optical flow in 2D and 3D. Despite the superiority of these approaches, they suffer from larger computational burden or come at the cost of additional training data.…”

Section: Monocular Vision (mentioning)

confidence: 99%

SteReFo: Efficient Image Refocusing with Stereo Vision

Busam

Hog

McDonagh

et al. 2019

2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)


Whether to attract viewer attention to a particular object, give the impression of depth or simply reproduce humanlike scene perception, shallow depth of field images are used extensively by professional and amateur photographers alike. To this end, high quality optical systems are used in DSLR cameras to focus on a specific depth plane while producing visually pleasing bokeh. We propose a physically motivated pipeline to mimic this effect from all-in-focus stereo images, typically retrieved by mobile cameras. It is capable to change the focal plane a posteriori at 76 FPS on KITTI [13] images to enable realtime applications. As our portmanteau suggests, SteReFo interrelates stereo-based depth estimation and refocusing efficiently. In contrast to other approaches, our pipeline is simultaneously fully differentiable, physically motivated, and agnostic to scene content. It also enables computational video focus tracking for moving objects in addition to refocusing of static images. We evaluate our approach on publicly available datasets [13,33,9] and quantify the quality of architectural changes.


“…Real-time 3D semantic mapping is often desired in a number of robotics applications, such as localization [1, 2], semantic navigation [3, 4] and human-aware navigation [5]. The semantic information provided with a 3D dense map is more useful than the geometric information [6] itself in robot-human or robot-environment interaction.…”

Section: Introduction (mentioning)

confidence: 99%

Dense RGB-D Semantic Mapping with Pixel-Voxel Neural Network

Zhao

Sun

Purkait

et al. 2018

Sensors

Self Cite


In this paper, a novel Pixel-Voxel network is proposed for dense 3D semantic mapping, which can perform dense 3D mapping while simultaneously recognizing and labelling the semantic category each point in the 3D map. In our approach, we fully leverage the advantages of different modalities. That is, the PixelNet can learn the high-level contextual information from 2D RGB images, and the VoxelNet can learn 3D geometrical shapes from the 3D point cloud. Unlike the existing architecture that fuses score maps from different modalities with equal weights, we propose a weighted fusion stack that adaptively learns the varying contributions of PixelNet and VoxelNet and fuses the score maps according to their respective confidence levels. Our approach achieved competitive results on both the SUN RGB-D and NYU V2 benchmarks, while the runtime of the proposed system is boosted to around 13 Hz, enabling near-real-time performance using an i7 eight-cores PC with a single Titan X GPU.

