2022 International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra46639.2022.9811799

CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and Categorical 6D Pose and Size Estimation

Cited by 47 publications
(17 citation statements)
References 43 publications
“…Notably, only SDFEst [23] supports multiview setups, only iCaps [22] supports tracking over time, and only three methods include the detection part in their pipeline [25,49,24]. The other methods assume that an off-the-shelf detector (typically Mask R-CNN [10]) is available, but do not train it end-to-end with the pose and shape estimation part.…”
Section: Related Work (mentioning)
confidence: 99%
“…Most methods (including ours) employ two-stage pipelines in which an object detection or instance segmentation module first detects bounding boxes or masks, which are later used to estimate the object's pose and shape. In contrast, Irshad et al [25] proposed to use a single-shot architecture to detect objects and estimate their shape and pose jointly. While such an end-to-end approach might be easier to scale, data collection and data generation becomes more challenging compared to two-stage approaches, which can benefit from large-scale segmentation datasets.…”
Section: B Categorical Pose and Shape Estimation (mentioning)
confidence: 99%
“…CASS [34] learns a canonical shape space by VAE [35] to obtain a view-factorized RGB-D embedding. CenterSnap [36] presents a one-stage pipeline to reduce the computational cost. 6-PACK [37] recovers the pose by tracking inter-frame motion of the object.…”
Section: B Category-level Object Pose Estimation (mentioning)
confidence: 99%