Vehicle pose estimation via regression of semantic points of interest

Lopez, Javier; Agudo, Antonio; Moreno-Noguer, Francesc

doi:10.1109/ispa.2019.8868508

“…Pipelines using deep learning have seen great successes in areas such as human pose estimation [29,50,52,75,76], and pose estimation of household objects [23,48,56]. With the growing interest in self-driving vehicles, research has also focused on jointly estimating vehicle shape and pose [12,33,37,44,69]. Many open-source driving datasets have also been released for benchmarking [11,68,80].…”

Section: Related Work (citation type: mentioning)

confidence: 99%

“…Despite its popularity (e.g., the model is also used in human shape estimation and face detection [89]), pose estimation with active shape models leads to a non-convex optimization problem and local solvers get stuck in poor solutions, and are sensitive to outliers [84,89]. More recently, research effort has been devoted to end-to-end learning-based 3D pose estimation with encouraging results in human pose estimation [35] and vehicle pose estimation [12,33,37,44,69]; these approaches still require a large amount of 3D labeled data, which is hard to obtain in the wild.…”

Section: Introduction (citation type: mentioning)

confidence: 99%

Optimal Pose and Shape Estimation for Category-level 3D Object Perception

Shi¹, Yang², Carlone³

2021

Robotics: Science and Systems XVII


We consider a category-level perception problem, where one is given 3D sensor data picturing an object of a given category (e.g., a car), and has to reconstruct the pose and shape of the object despite intra-class variability (i.e., different car models have different shapes). We consider an active shape model, where, for an object category, we are given a library of potential CAD models describing objects in that category, and we adopt a standard formulation where pose and shape estimation are formulated as a non-convex optimization. Our first contribution is to provide the first certifiably optimal solver for pose and shape estimation. In particular, we show that rotation estimation can be decoupled from the estimation of the object translation and shape, and we demonstrate that (i) the optimal object rotation can be computed via a tight (small-size) semidefinite relaxation, and (ii) the translation and shape parameters can be computed in closed form given the rotation. Our second contribution is to add an outlier rejection layer to our solver, hence making it robust to a large number of misdetections. Towards this goal, we wrap our optimal solver in a robust estimation scheme based on graduated non-convexity. To further enhance robustness to outliers, we also develop the first graph-theoretic formulation to prune outliers in category-level perception, which removes outliers via convex hull and maximum clique computations; the resulting approach is robust to 70-90% outliers. Our third contribution is an extensive experimental evaluation. Besides providing an ablation study on a simulated dataset and on the PASCAL3D+ dataset, we combine our solver with a deep-learned keypoint detector, and show that the resulting approach improves over the state of the art in vehicle pose estimation in the ApolloScape datasets.
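The claim that translation and shape follow in closed form once the rotation is known can be illustrated with a minimal sketch. It is not the authors' solver: it assumes an unregularized, unconstrained active shape model in which each measured keypoint is y_i = R (sum_k c_k b_{k,i}) + t, so that for a fixed R the unknowns c and t enter linearly and an ordinary least-squares solve recovers them. The function name, argument layout, and the omission of any constraint or regularizer on c are assumptions made for illustration.

```python
import numpy as np

def shape_and_translation_given_rotation(Y, B, R):
    """Least-squares shape coefficients c and translation t for a fixed rotation R.

    Y: (N, 3) measured keypoints.
    B: (K, N, 3) keypoints of the K library (CAD) models.
    R: (3, 3) rotation matrix, assumed already estimated.
    Hypothetical helper: no constraint or regularization is placed on c.
    """
    K, N, _ = B.shape
    # Rotate every library keypoint: RB[k, i] = R @ B[k, i].
    RB = B @ R.T
    # One column per shape coefficient: the rotated model, flattened point by point.
    A_c = RB.reshape(K, N * 3).T            # shape (3N, K)
    # Three columns for the translation: an identity block repeated for each keypoint.
    A_t = np.tile(np.eye(3), (N, 1))        # shape (3N, 3)
    A = np.hstack([A_c, A_t])
    # Linear least squares in the stacked unknown [c; t].
    x, *_ = np.linalg.lstsq(A, Y.reshape(-1), rcond=None)
    return x[:K], x[K:]
```

With noise-free keypoints and a generic library (3N at least K + 3 and a full-rank design matrix), the returned c and t match the values used to generate the data; the rotation itself would still have to come from a separate estimator, such as the semidefinite relaxation the abstract describes.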


“…Pipelines using deep learning have seen great successes in areas such as human pose estimation [29,49,51,74,75], and pose estimation of household objects [23,47,55]. With the growing interest in self-driving vehicles, research has also focused on jointly estimating vehicle shape and pose [12,33,37,44,68]. Many open-source driving datasets have also been released for benchmarking [11,67,79].…”

Section: Related Work (citation type: mentioning)

confidence: 99%

“…Despite its popularity (e.g., the model is also used in human shape estimation and face detection [88]), pose estimation with active shape models leads to a non-convex optimization problem and local solvers get stuck in poor solutions, and are sensitive to outliers [83,88]. More recently, research effort has been devoted to end-to-end learning-based 3D pose estimation with encouraging results in human pose estimation [35] and vehicle pose estimation [12,33,37,44,68]; these approaches still require a large amount of 3D labeled data, which is hard to obtain in the wild.…”

Section: Introduction (citation type: mentioning)

confidence: 99%

Optimal Pose and Shape Estimation for Category-level 3D Object Perception

Shi¹, Yang², Carlone³

2021

Preprint



“…Object pose estimation from 3D point clouds is an important problem in robot perception, with applications including industrial robotics [3]-[6], self-driving cars [7]-[13], and domestic robotics [1], [2], [14]-[17]. Availability of pose-annotated datasets has fueled recent progress towards solving this problem [1]-[3], [7]-[9].…”

Section: Introduction (citation type: mentioning)

confidence: 99%

Certifiable 3D Object Pose Estimation: Foundations, Learning Models, and Self-Training

Talak¹, Peng², Carlone³

2022

Preprint


We consider an object pose estimation and model fitting problem, where, given a partial point cloud of an object, the goal is to estimate the object pose by fitting a CAD model to the sensor data. We solve this problem by combining (i) a semantic keypoint-based pose estimation model, (ii) a novel self-supervised training approach, and (iii) a certification procedure that not only verifies whether the output produced by the model is correct or not, but also flags uniqueness of the produced solution. The semantic keypoint detector model is initially trained in simulation and does not perform well on real data due to the domain gap. Our self-supervised training procedure uses a corrector and a certification module to improve the detector. The corrector module corrects the detected keypoints to compensate for the domain gap, and is implemented as a declarative layer, for which we develop a simple differentiation rule. The certification module declares whether the corrected output produced by the model is certifiable (i.e., correct) or not. At each iteration, the approach optimizes over the loss induced only by the certifiable input-output pairs. As training progresses, we see that the fraction of outputs that are certifiable increases, eventually reaching near 100% in many cases. We conduct extensive experiments to evaluate the performance of the corrector, the certification, and the proposed self-supervised training using the ShapeNet and YCB datasets, and show the proposed approach achieves performance comparable to fully supervised baselines while not requiring pose or keypoint supervision on real data.


Vehicle pose estimation via regression of semantic points of interest

Cited by 12 publications

References 30 publications
