ARKitScenes: A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data
Preprint, 2021
DOI: 10.48550/arxiv.2111.08897

Abstract: Scene understanding is an active research area. Commercial depth sensors, such as Kinect, have enabled the release of several RGB-D datasets over the past few years, which spawned novel methods in 3D scene understanding. More recently, with the launch of the LiDAR sensor in Apple's iPads and iPhones, high-quality RGB-D data is accessible to millions of people on a device they commonly use. This opens a whole new era in scene understanding for the Computer Vision community as well as app developers. The fundament…

Cited by 5 publications (9 citation statements)
References 32 publications
“…Kinect V1/V2 [11], OAK-D-Lite [12], and even iPhone's back-facing camera satisfy the database requirements [13]. A depth image produced using iPhone camera is shown below in Figure 2. After experimentation with several cameras (iPhone X, OAK-D-Lite, Intel RealSense L515, Kinect V1, Kinect V2), it was decided to use Kinect V2 as it produced the most detailed depth map out of all tested cameras.…”
Section: Demonstration Robotic System (mentioning)
confidence: 99%
“…Its elements correspond to the column-index in the matrix of corners. Corner-index pairs (Eq. 4): [2,8], [3,8], [1,3], [4,7], [7,5], [6,5], [4,6], [1,4], [2,7], [8,5], [3,6]]; corner-index triples: [1,2,4], [1,3,4], [5,6,7], [5,6,8], [5,7,8]]…”
Section: Definition Of A Bounding Box (mentioning)
confidence: 99%
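For concreteness, below is a minimal Python sketch of one common way to represent such a box: the 8 corners are generated from a center, size, and rotation, and the edges and faces are stored as lists of corner indices. The corner ordering, and therefore the index lists, are assumptions for illustration only; they do not reproduce the 1-based column indexing of the quoted excerpt.

```python
import numpy as np

def box_corners(center, size, R):
    """Return the 8 corners of an oriented 3D bounding box.

    center: (3,) box centroid
    size:   (3,) full extents (dx, dy, dz)
    R:      (3, 3) rotation matrix
    """
    dx, dy, dz = np.asarray(size, dtype=float) / 2.0
    # Unit-box corner signs; this ordering is an assumption for the sketch,
    # not the corner ordering used in the cited work.
    signs = np.array([[sx, sy, sz]
                      for sx in (-1, 1)
                      for sy in (-1, 1)
                      for sz in (-1, 1)], dtype=float)
    corners = signs * np.array([dx, dy, dz])
    return corners @ R.T + np.asarray(center, dtype=float)

# Edges and faces as corner-index lists (0-based here, matching the
# corner ordering produced by box_corners above).
EDGES = [(0, 1), (0, 2), (0, 4), (1, 3), (1, 5), (2, 3),
         (2, 6), (3, 7), (4, 5), (4, 6), (5, 7), (6, 7)]
FACES = [(0, 1, 3, 2), (4, 5, 7, 6), (0, 1, 5, 4),
         (2, 3, 7, 6), (0, 2, 6, 4), (1, 3, 7, 5)]

# Example: an axis-aligned 2 x 1 x 1 box centered at the origin.
corners = box_corners(center=[0, 0, 0], size=[2, 1, 1], R=np.eye(3))
```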
“…As 3D object detection gets more popular and new datasets are published [1,2,3], evaluation metrics gain in importance. The most common one is Intersection over Union (IoU).…”
Section: Introduction (mentioning)
confidence: 99%
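As an illustration of the metric named in that excerpt, here is a minimal sketch of 3D IoU for axis-aligned boxes. Real 3D-detection benchmarks typically evaluate oriented boxes, whose intersection volume is more involved to compute; this simplified axis-aligned version only shows the overlap-volume-over-union idea.

```python
import numpy as np

def iou_3d_axis_aligned(box_a, box_b):
    """3D IoU for axis-aligned boxes given as (xmin, ymin, zmin, xmax, ymax, zmax)."""
    box_a = np.asarray(box_a, dtype=float)
    box_b = np.asarray(box_b, dtype=float)
    # Overlap extent along each axis, clamped at zero when the boxes are disjoint.
    lo = np.maximum(box_a[:3], box_b[:3])
    hi = np.minimum(box_a[3:], box_b[3:])
    inter = np.prod(np.clip(hi - lo, 0.0, None))
    vol_a = np.prod(box_a[3:] - box_a[:3])
    vol_b = np.prod(box_b[3:] - box_b[:3])
    union = vol_a + vol_b - inter
    return inter / union if union > 0 else 0.0

# Example: two unit cubes offset by 0.5 along x -> IoU = 0.5 / 1.5 = 1/3.
print(iou_3d_axis_aligned((0, 0, 0, 1, 1, 1), (0.5, 0, 0, 1.5, 1, 1)))
```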
“…On-device depth estimation is critical in navigation [40], gaming [6], and augmented/virtual reality [3, 8]. Previously, various solutions based on stereo/structured-light sensors and indirect time-of-flight sensors (iToF) [4, 34, 55] have been proposed.…”
Section: Introduction (mentioning)
confidence: 99%
“…Each dToF pixel captures and pre-processes depth information from a local patch in the scene (Sec. 3), leading to high spatial ambiguity when estimating the high-resolution depth maps for downstream tasks [8]. Previous RGB-guided depth completion and super-resolution algorithms either assume high resolution spatial information (e.g.…”
Section: Introduction (mentioning)
confidence: 99%
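To make the idea of RGB-guided depth super-resolution mentioned in that excerpt concrete, here is a small sketch of a classic joint bilateral upsampling filter in Python. This is a generic baseline assumed for illustration, not the method of the cited dToF work: the high-resolution RGB image weights neighboring low-resolution depth samples by spatial and color similarity.

```python
import numpy as np

def joint_bilateral_upsample(depth_lr, rgb_hr, sigma_s=4.0, sigma_r=0.1, radius=6):
    """Upsample a low-resolution depth map guided by a high-resolution RGB image.

    depth_lr: (h, w) low-resolution depth
    rgb_hr:   (H, W, 3) guide image with values in [0, 1]
    Returns an (H, W) depth map. A plain joint bilateral filter, written for
    clarity rather than speed.
    """
    H, W = rgb_hr.shape[:2]
    h, w = depth_lr.shape
    # Nearest-neighbor upsampled depth used as the value being smoothed.
    ys = (np.arange(H) * h // H).astype(int)
    xs = (np.arange(W) * w // W).astype(int)
    depth_nn = depth_lr[ys][:, xs]
    out = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            y0, y1 = max(0, y - radius), min(H, y + radius + 1)
            x0, x1 = max(0, x - radius), min(W, x + radius + 1)
            gy, gx = np.mgrid[y0:y1, x0:x1]
            # Spatial Gaussian weight and RGB (range) Gaussian weight.
            spatial = np.exp(-((gy - y) ** 2 + (gx - x) ** 2) / (2 * sigma_s ** 2))
            color = np.exp(-np.sum((rgb_hr[y0:y1, x0:x1] - rgb_hr[y, x]) ** 2, axis=-1)
                           / (2 * sigma_r ** 2))
            weights = spatial * color
            out[y, x] = np.sum(weights * depth_nn[y0:y1, x0:x1]) / np.sum(weights)
    return out
```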