2020
DOI: 10.48550/arxiv.2012.09988
Preprint

Objectron: A Large Scale Dataset of Object-Centric Videos in the Wild with Pose Annotations

Abstract: 3D object detection has recently become popular due to many applications in robotics, augmented reality, autonomy, and image retrieval. We introduce the Objectron dataset to advance the state of the art in 3D object detection and foster new research and applications, such as 3D object tracking, view synthesis, and improved 3D shape representation. The dataset contains object-centric short videos with pose annotations for nine categories and includes 4 million annotated images in 14,819 annotated videos. We al…

Cited by 5 publications (7 citation statements)
References 19 publications
“…Multi-View Cars [24] and BigBIRD [49] are real-world object-focused multi-view datasets but have limited numbers of instances and categories. The Object Scans dataset [6] and Objectron [1] are both video datasets that have the camera operator walk around various objects, but are similarly limited in the number of categories represented in the dataset. CO3D [46] also offers videos of common objects from 50 different categories, however they do not provide full 3D mesh reconstructions.…”
Section: Related Work (mentioning)
confidence: 99%
“…The former lacks scene level labels and the latter lacks oriented 3D bounding box labels. More recent datasets such as [36] provide a variety of objects with 3D labels but do not provide depth sensor data. ARKitScenes provides the largest set of 3D oriented bounding boxes for a set of 17 room-defining object categories that addresses the gaps of previous indoor datasets.…”
Section: Related Work (mentioning)
confidence: 99%
“…Objectron [29] dataset is a collection of short object-centric video clips, accompanied by AR session metadata, including camera poses, sparse point clouds, and characterization of the planar surfaces in the surrounding environment. In each video, the camera moves around the object, capturing it from different views.…”
Section: Dataset For Category Level Object Pose Detection and Tracking (mentioning)
confidence: 99%
“…Then, the up-to-scale 9DoF object bounding box is recovered by solving the EPnP [43] problem. Lately, Ahmadyan et al. [29] release the Objectron dataset they used and introduce MobilePose v2, a two-stage category-level 6D pose detection pipeline built upon MobilePose. In MobilePose v2, SSD [36] is first used to detect 2D object patches in the image.…”
Section: Category Level Monocular 6D Pose Detection (mentioning)
confidence: 99%