The assumption of scene rigidity is typical in SLAM algorithms. Such a strong assumption limits the use of most visual SLAM systems in populated real-world environments, which are the target of several relevant applications like service robotics or autonomous vehicles. In this paper we present DynaSLAM, a visual SLAM system that, building on ORB-SLAM2 [1], adds the capabilities of dynamic object detection and background inpainting. DynaSLAM is robust in dynamic scenarios for monocular, stereo, and RGB-D configurations. It detects moving objects by multi-view geometry, deep learning, or both. Having a static map of the scene allows inpainting the frame background that has been occluded by such dynamic objects. We evaluate our system on public monocular, stereo, and RGB-D datasets, and we study several accuracy/speed trade-offs to assess the limits of the proposed methodology. DynaSLAM outperforms the accuracy of standard visual SLAM baselines in highly dynamic scenarios, and it also estimates a map of the static parts of the scene, which is a must for long-term applications in real-world environments.
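The multi-view geometric test mentioned above can be sketched as follows. This is a hypothetical simplification, not DynaSLAM's actual implementation: a pixel is back-projected from a reference keyframe using its measured depth, transformed into the current frame with the known relative pose, and flagged as dynamic if its predicted depth disagrees with the depth actually measured at the reprojected location. All names and the threshold `tau` are illustrative assumptions.

```python
import numpy as np

def is_dynamic_point(p_key, depth_key, T_key_to_cur, K, depth_cur, tau=0.4):
    """Simplified multi-view geometric check in the spirit of DynaSLAM:
    back-project pixel p_key from a reference keyframe, transform it into
    the current frame, and compare its predicted depth with the depth
    measured there. A large residual suggests the point moved."""
    u, v = p_key
    # Back-project pixel (u, v) with its keyframe depth into 3-D.
    X_key = depth_key * np.linalg.inv(K) @ np.array([u, v, 1.0])
    # Rigid transform (4x4 homogeneous matrix) into the current camera frame.
    X_cur = T_key_to_cur[:3, :3] @ X_key + T_key_to_cur[:3, 3]
    z_proj = X_cur[2]
    # Project to find where the point lands in the current image.
    uvw = K @ X_cur
    u_cur, v_cur = uvw[0] / uvw[2], uvw[1] / uvw[2]
    # Measured depth at the reprojected pixel (nearest-neighbour lookup).
    z_meas = depth_cur[int(round(v_cur)), int(round(u_cur))]
    return abs(z_proj - z_meas) > tau
```

If the scene were truly rigid, the predicted and measured depths agree up to noise; a discrepancy larger than `tau` marks the pixel as belonging to a moving object, after which it can be excluded from tracking and its background inpainted.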
Object detectors are typically trained on a large set of still images annotated with bounding boxes. This paper introduces an approach for learning object detectors from real-world web videos known only to contain objects of a target class. We propose a fully automatic pipeline that localizes objects in a set of videos of the class and learns a detector for it. The approach extracts candidate spatio-temporal tubes based on motion segmentation and then selects one tube per video jointly over all videos. To compare to the state of the art, we test our detector on still images, i.e., Pascal VOC 2007. We observe that frames extracted from web videos can differ significantly in quality from still images taken with a good camera. Thus, we formulate learning from videos as a domain adaptation task. We show that training from a combination of weakly annotated videos and fully annotated still images using domain adaptation improves the performance of a detector trained from still images alone.
Abstract—We present a new parametrization for point features within monocular simultaneous localization and mapping (SLAM) that permits efficient and accurate representation of uncertainty during undelayed initialization and beyond, all within the standard extended Kalman filter (EKF). The key concept is direct parametrization of the inverse depth of features relative to the camera locations from which they were first viewed, which produces measurement equations with a high degree of linearity.

Manuscript received February 27, 2007; revised September 28, 2007. This paper has supplementary downloadable multimedia material available at http://ieeexplore.ieee.org, provided by the authors. In all videos, the processing is automatic, the image sequence is the only sensory input, the result is shown as a top view of the computed camera trajectory and 3-D scene map, and the sequences were acquired with a hand-held 320×240 camera at 30 frames/second (player information: XviD MPEG-4). The material includes the following video files:
- inverseDepth_indoor.avi (11.7 MB): SLAM from a hand-held camera observing an indoor scene.
- inverseDepth_outdoor.avi (12.4 MB): real-time SLAM from a hand-held camera observing an outdoor scene, including rather distant features; the processing is done on a standard laptop.
- inverseDepth_loopClosing.avi (10.2 MB): SLAM from a hand-held camera observing a loop-closing indoor scene.
- inverseDepth_loopClosing_ID_to_XYZ_conversion.avi (10.1 MB): SLAM on the same loop-closing indoor sequence as in inverseDepth_loopClosing.avi, but switching from inverse depth to XYZ parametrization when necessary.
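The inverse-depth-to-XYZ conversion referenced by the last video can be illustrated with the paper's standard six-parameter feature encoding: the camera centre of first observation, the azimuth/elevation of the observation ray, and the inverse depth along it. The sketch below is a minimal rendering of that conversion, with illustrative names:

```python
import numpy as np

def inverse_depth_to_xyz(y):
    """Convert a six-parameter inverse-depth feature
    y = (x, y, z, theta, phi, rho) into a Euclidean 3-D point.
    (x, y, z) is the camera centre from which the feature was first
    seen, (theta, phi) the azimuth/elevation of the observation ray,
    and rho the inverse depth along that ray."""
    x0, y0, z0, theta, phi, rho = y
    # Unit direction vector of the ray, m(theta, phi).
    m = np.array([np.cos(phi) * np.sin(theta),
                  -np.sin(phi),
                  np.cos(phi) * np.cos(theta)])
    # The point lies at distance 1/rho from the first-view camera centre.
    return np.array([x0, y0, z0]) + m / rho
```

Parametrizing by inverse depth rho rather than depth is what keeps the measurement equation nearly linear for low-parallax (distant) features, since rho → 0 represents a point at infinity without any singularity.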
Random sample consensus (RANSAC) has become one of the most successful techniques for robust estimation from a data set that may contain outliers. It works by constructing model hypotheses from random minimal data subsets and evaluating their validity from the support of the whole data. In this paper we present a novel combination of RANSAC plus extended Kalman filter (EKF) that uses the available prior probabilistic information from the EKF in the RANSAC model hypothesis stage. This allows the minimal sample size to be reduced to one, resulting in large computational savings without loss of discriminative power. 1-Point RANSAC is shown to outperform, both in accuracy and computational cost, the joint compatibility branch and bound (JCBB) algorithm, a gold-standard technique for spurious rejection within the EKF framework. Two visual estimation scenarios are used in the experiments: first, six-degree-of-freedom (DOF) motion estimation from a monocular sequence (structure from motion). Here, a new method for benchmarking six-DOF visual estimation algorithms based on the use of high-resolution images is presented, validated, and used to show the superiority of 1-point RANSAC. Second, we demonstrate long-term robot trajectory estimation combining monocular vision and wheel odometry (visual odometry). Here, a comparison against a global positioning system shows an accuracy comparable to state-of-the-art visual odometry methods.
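The core idea, that strong prior knowledge lets one data point suffice to hypothesize a model, can be shown with a toy example that does not use an EKF: estimating a pure 2-D translation from correspondences with outliers. Because the prior (here, the assumed translational model) constrains the motion, each hypothesis needs only one correspondence, just as the EKF prediction constrains the full 6-DOF motion in the paper. All names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_point_ransac_translation(src, dst, thresh=0.1, n_iters=30):
    """Toy 1-point RANSAC: each hypothesis is generated from a single
    correspondence, then scored by the support of all the data."""
    best_inliers = np.zeros(len(src), dtype=bool)
    n_hyp = min(n_iters, len(src))
    for i in rng.choice(len(src), size=n_hyp, replace=False):
        t = dst[i] - src[i]                            # hypothesis from ONE point
        residuals = np.linalg.norm(dst - (src + t), axis=1)
        inliers = residuals < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refine with all inliers (least-squares estimate of a translation
    # is just the mean correspondence offset).
    t_hat = (dst[best_inliers] - src[best_inliers]).mean(axis=0)
    return t_hat, best_inliers
```

In the paper's setting the hypothesis step instead performs a partial EKF update from a single randomly chosen match, and the predicted state supplies the remaining constraints; the scoring and refinement logic is analogous.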