Recent progress in learning-based object pose estimation paves the way for richer object-level world representations. However, these estimators are often trained on out-of-domain data and can suffer performance degradation when deployed in novel environments. To address this problem, we present a SLAM-supported self-training procedure that autonomously improves a robot's object pose estimation ability during navigation. By combining the network predictions with robot odometry, we build a consistent object-level environment map via pose graph optimization (PGO). Exploiting the state estimates from PGO, we pseudo-label robot-collected RGB images to fine-tune the pose estimators. Unfortunately, the uncertainty of the estimator predictions is difficult to quantify, and unmodeled measurement uncertainty in PGO can result in low-quality object pose estimates. We therefore develop an automatic covariance tuning method for robust PGO that allows the measurement uncertainty models to change as part of the optimization process. The formulation permits a straightforward alternating minimization procedure that re-scales covariances analytically and component-wise, enabling more flexible noise modeling for learning-based measurements. We test our method with the Deep Object Pose Estimation (DOPE) network on the YCB video dataset and in real-world robot experiments. The method achieves significant performance gains in pose estimation and, in turn, facilitates successful object SLAM.
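To illustrate how an analytic, component-wise covariance re-scaling can arise from alternating minimization, consider the following sketch (the notation is illustrative and not necessarily the formulation used in the paper). Suppose each scalar residual $r_i(x)$ is assigned a nominal variance $\bar{\sigma}_i^2$ multiplied by a free scale $s_i > 0$, and the scales are optimized jointly with the poses under the Gaussian negative log-likelihood:
\[
\min_{x,\; s_i > 0} \;\; \sum_i \left[ \frac{r_i(x)^2}{s_i \bar{\sigma}_i^2} + \log\!\left(s_i \bar{\sigma}_i^2\right) \right].
\]
Holding $x$ fixed, each $s_i$ decouples, and setting the derivative with respect to $s_i$ to zero yields the closed-form update $s_i = r_i(x)^2 / \bar{\sigma}_i^2$; holding the scales fixed, the problem reduces to a standard weighted PGO over $x$. Alternating these two steps is one way to realize automatic covariance tuning; in practice a lower bound on $s_i$ is typically needed to avoid degenerate zero covariances.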