Abstract: Scene understanding is an active research area. Commercial depth sensors, such as Kinect, have enabled the release of several RGB-D datasets over the past few years, which spawned novel methods in 3D scene understanding. More recently, with the launch of the LiDAR sensor in Apple's iPads and iPhones, high-quality RGB-D data is accessible to millions of people on a device they commonly use. This opens a whole new era in scene understanding for the Computer Vision community as well as app developers. The fundament…
“…Kinect V1/V2 [11], OAK-D-Lite [12], and even the iPhone's back-facing camera satisfy the database requirements [13]. A depth image produced using an iPhone camera is shown below in Figure 2. After experimentation with several cameras (iPhone X, OAK-D-Lite, Intel RealSense L515, Kinect V1, Kinect V2), it was decided to use the Kinect V2, as it produced the most detailed depth map of all tested cameras.…”
A large number of robotic and human-assisted missions to the Moon and Mars are forecast. NASA's efforts to learn about the geology and makeup of these celestial bodies rely heavily on the use of robotic arms. Safety and redundancy will be crucial when humans work alongside the robotic explorers. Additionally, robotic arms are crucial to satellite servicing and planned orbital debris mitigation missions. The goal of this work is to create a custom Computer Vision (CV) based Artificial Neural Network (ANN) that can rapidly identify the posture of a 7 Degree of Freedom (DoF) robotic arm from a single RGB-D image, just as humans can easily tell whether an arm is pointing in some general direction. The Sawyer robotic arm is used for developing and training this intelligent algorithm. Since Sawyer's joint space spans 7 dimensions, covering the entire joint configuration space exhaustively is an insurmountable task. In this work, orthogonal arrays are used, similar to the Taguchi method, to efficiently span the joint space with a minimal number of training images. This "optimally" generated database is used to train the custom ANN, whose accuracy is on average equal to twice the smallest joint displacement step used for database generation. A pre-trained ANN will be useful for estimating the postures of robotic manipulators used on space stations, spacecraft, and rovers, either as an auxiliary tool or for contingency plans.
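To make the orthogonal-array idea concrete, here is a minimal sketch built on the standard Taguchi L8(2^7) array, which covers 7 two-level factors in only 8 runs. The joint limits, the two-level discretization, and the level-to-angle mapping below are illustrative assumptions; the paper's database presumably uses finer joint steps and correspondingly larger arrays.

```python
import numpy as np

# Standard Taguchi L8(2^7) orthogonal array: 8 runs covering 7 two-level
# factors so that every pair of levels appears equally often per column pair.
L8 = np.array([
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 2, 2, 2, 2],
    [1, 2, 2, 1, 1, 2, 2],
    [1, 2, 2, 2, 2, 1, 1],
    [2, 1, 2, 1, 2, 1, 2],
    [2, 1, 2, 2, 1, 2, 1],
    [2, 2, 1, 1, 2, 2, 1],
    [2, 2, 1, 2, 1, 1, 2],
])

# Hypothetical joint limits (radians); Sawyer's real limits differ per joint.
joint_min = np.full(7, -1.0)
joint_max = np.full(7, 1.0)

# Map level 1 -> joint_min and level 2 -> joint_max, column by column.
levels = (L8 - 1).astype(float)                     # values in {0.0, 1.0}
postures = joint_min + levels * (joint_max - joint_min)

# 8 camera shots instead of 2**7 = 128 for a full two-level factorial sweep.
print(postures.shape)  # (8, 7): one 7-joint posture per row
```

Each row would correspond to one posture to photograph; with more levels per joint, a larger orthogonal array plays the same role of sampling the configuration space evenly without enumerating it.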
“…Its elements correspond to the column-index in the matrix of corners:

[[2,8], [3,8], [1,3], [4,7], [7,5], [6,5], [4,6], [1,4], [2,7], [8,5], [3,6]]   (4)

[[1,2,4], [1,3,4], [5,6,7], [5,6,8], [5,7,8]]…”
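As an illustrative aside, such a corner matrix and edge-index list can be constructed programmatically. The sketch below assumes a simple binary corner ordering for an axis-aligned unit box, so its 0-based edge indices differ from the paper's 1-based list quoted above.

```python
import numpy as np
from itertools import product

# Corners of an axis-aligned unit box as a 3 x 8 matrix; the binary
# ordering here is an assumption and differs from the paper's numbering.
corners = np.array(list(product([0.0, 1.0], repeat=3))).T

# An edge connects two corners that differ in exactly one coordinate,
# which recovers the 12 edges as pairs of column indices (0-based here).
edges = [(i, j)
         for i in range(8)
         for j in range(i + 1, 8)
         if np.sum(corners[:, i] != corners[:, j]) == 1]

print(len(edges))   # 12
print(edges[:3])    # [(0, 1), (0, 2), (0, 4)]
```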
Section: Definition of a Bounding Box
“…As 3D object detection gets more popular and new datasets are published [1,2,3], evaluation metrics gain in importance. The most common one is Intersection over Union (IoU).…”
The most popular evaluation metric for object detection in 2D images is Intersection over Union (IoU). Existing implementations of the IoU metric for 3D object detection usually neglect one or more degrees of freedom. In this paper, we first derive the analytic solution for three-dimensional bounding boxes. As a second contribution, a closed-form solution of the volume-to-volume distance is derived. Finally, the Bounding Box Disparity is proposed as a combined, positive, continuous metric. We provide open-source implementations of the three metrics as standalone Python functions, as well as extensions to the Open3D library and as ROS nodes.
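For intuition, here is a hedged sketch of the metric in its simplest setting: IoU for axis-aligned boxes only. The paper's actual contribution is the analytic solution for fully oriented 3D boxes (plus the volume-to-volume distance and Bounding Box Disparity built on top), all of which this toy version deliberately omits.

```python
import numpy as np

def iou_3d_axis_aligned(box_a, box_b):
    """IoU of two axis-aligned 3D boxes given as (min_xyz, max_xyz) arrays."""
    min_a, max_a = box_a
    min_b, max_b = box_b
    # Per-axis overlap length, clamped to zero when the boxes are disjoint.
    overlap = np.clip(np.minimum(max_a, max_b) - np.maximum(min_a, min_b),
                      0.0, None)
    inter = overlap.prod()
    union = (max_a - min_a).prod() + (max_b - min_b).prod() - inter
    return inter / union if union > 0 else 0.0

# Unit cube vs. a unit cube shifted by 0.5 along every axis.
a = (np.zeros(3), np.ones(3))
b = (np.full(3, 0.5), np.full(3, 1.5))
print(iou_3d_axis_aligned(a, b))  # 0.125 / 1.875 ≈ 0.0667
```

Once the boxes may rotate, the intersection is a general convex polyhedron rather than another axis-aligned box, which is exactly why the analytic treatment in the paper is needed.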
“…On-device depth estimation is critical in navigation [40], gaming [6], and augmented/virtual reality [3,8]. Previously, various solutions based on stereo/structured-light sensors and indirect time-of-flight (iToF) sensors [4, 34, 55] have been proposed.…”
Section: Introduction
“…Each dToF pixel captures and pre-processes depth information from a local patch in the scene (Sec. 3), leading to high spatial ambiguity when estimating the high-resolution depth maps for downstream tasks [8]. Previous RGB-guided depth completion and super-resolution algorithms either assume high resolution spatial information (e.g.…”
Figure 1. We propose the first multi-frame approaches, dToF depth video super-resolution (DVSR) and histogram video super-resolution (HVSR), to super-resolve low-resolution dToF sensor videos with high-resolution RGB frame guidance. The point cloud visualizations of depth predictions reveal that, by utilizing multi-frame correlation, DVSR predicts significantly better geometry than state-of-the-art per-frame depth enhancement networks [41] while being more lightweight; HVSR further improves the fidelity of the geometry and reduces flying pixels by utilizing the dToF histogram information. Beyond the improvements in per-frame estimation, we highly recommend readers check out the supplementary video, which visualizes the significant improvements in temporal stability across the entire sequences.
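To illustrate the patch aggregation the quoted passage describes, here is a toy simulation; the simplified sensor model is our assumption, not the authors' exact formulation. Each low-resolution dToF pixel collects a per-patch depth histogram, and collapsing that histogram to a single reading shows where the spatial ambiguity that DVSR/HVSR must resolve comes from.

```python
import numpy as np

def simulate_dtof(depth_hr, patch=8, bins=16, max_depth=10.0):
    """Toy dToF model: each low-res pixel histograms one high-res patch."""
    h, w = depth_hr.shape
    h_lr, w_lr = h // patch, w // patch
    edges = np.linspace(0.0, max_depth, bins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    hist = np.zeros((h_lr, w_lr, bins))
    depth_lr = np.zeros((h_lr, w_lr))
    for i in range(h_lr):
        for j in range(w_lr):
            block = depth_hr[i*patch:(i+1)*patch, j*patch:(j+1)*patch]
            counts, _ = np.histogram(block, bins=edges)
            hist[i, j] = counts
            # Collapsing the histogram to its peak bin discards everything
            # else in the patch: the spatial ambiguity described above.
            depth_lr[i, j] = centers[np.argmax(counts)]
    return depth_lr, hist

depth_hr = np.random.uniform(0.5, 9.5, size=(64, 64))   # synthetic scene
depth_lr, hist = simulate_dtof(depth_hr)
print(depth_lr.shape, hist.shape)  # (8, 8) (8, 8, 16)
```

Keeping the full `hist` tensor rather than only `depth_lr` mirrors the distinction between the HVSR and DVSR inputs: the histogram retains multi-modal depth evidence within each patch that a single per-pixel depth value throws away.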