Abstract-The state of the art in computer vision has rapidly advanced over the past decade largely aided by shared image datasets. However, most of these datasets tend to consist of assorted collections of images from the web that do not include 3D information or pose information. Furthermore, they target the problem of object category recognition-whereas solving the problem of object instance recognition might be sufficient for many robotic tasks. To address these issues, we present a highquality, large-scale dataset of 3D object instances, with accurate calibration information for every image. We anticipate that "solving" this dataset will effectively remove many perceptionrelated problems for mobile, sensing-based robots.The contributions of this work consist of: (1) BigBIRD, a dataset of 100 objects (and growing), composed of, for each object, 600 3D point clouds and 600 high-resolution (12 MP) images spanning all views, (2) a method for jointly calibrating a multi-camera system, (3) details of our data collection system, which collects all required data for a single object in under 6 minutes with minimal human effort, and (4) multiple software components (made available in open source), used to automate multi-sensor calibration and the data collection process. All code and data are available at
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.