Kubric: A scalable dataset generator

Greff, Klaus; Belletti, Francois; Beyer, Lucas; Doersch, Carl; Du, Yilun; Duckworth, Daniel; Fleet, David J.; Gnanapragasam, Dan; Golemo, Florian; Herrmann, Charles; Kipf, Thomas; Kundu, Abhijit; Lagun, Dmitry; Laradji, Issam; Liu, Hsueh-Ti (Derek); Meyer, Henning; Miao, Yishu; Nowrouzezahrai, Derek; Öztireli, Cengiz; Pot, Etienne; Radwan, Noha; Rebain, Daniel; Sabour, Sara; Sajjadi, Mehdi S. M.; Sela, Matan; Sitzmann, Vincent; Stone, Austin V.; Sun, Deqing; Vora, Suhani; Wang, Ziyu; Wu, Tianhao; Yi, Kwang Moo; Zhong, Fangcheng; Tagliasacchi, Andrea

doi:10.48550/arxiv.2203.03570

Cited by 5 publications

(7 citation statements)

References 77 publications

(113 reference statements)


“…Deep learning-based computer vision methods promise to fundamentally alter what is possible in animal behavioural research [12][13][14][15][16][17][18]. A key remaining bottleneck is the "data-hunger" of supervised learning techniques: annotated datasets of the size and variability required to achieve robust, domain-invariant performance are rarely available, and in any case time-intensive to produce [36,54]. One strategy to overcome this limitation is to produce annotated data synthetically, using sufficiently realistic computer simulations [31,32,34,36].…”

Section: Discussion (mentioning)

confidence: 99%

“…A key remaining bottleneck is the "data-hunger" of supervised learning techniques: annotated datasets of the size and variability required to achieve robust, domain-invariant performance are rarely available, and in any case time-intensive to produce [36,54]. One strategy to overcome this limitation is to produce annotated data synthetically, using sufficiently realistic computer simulations [31,32,34,36]. In order to facilitate this process, we developed replicAnt: a synthetic data generator built in Unreal Engine 5 and Python.…”

Section: Discussion (mentioning)

confidence: 99%

“…In robotic [34][35][36], human [37][38][39][40], and automated driving [41][42][43] applications, annotated datasets comprising billions of images can now be produced "synthetically", i. e. through simulation with a computer. By placing 3D models in simulated environments, variable and annotated datasets can be generated at scale, and at a fraction of the cost and time required for hand-annotation of real images [39,40,42,44]. The use of synthetic data is particularly attractive where annotated real datasets are practically absent or only of insufficient size, as is the case for almost all non-human animal studies [22,[30][31][32][45][46][47][48][49][50][51].…”

Section: Introduction (mentioning)

confidence: 99%


replicAnt: A pipeline for generating annotated images of animals in complex environments using Unreal Engine

Plum, Bulla, Beck et al. 2023

Preprint


Deep learning-based computer vision methods are transforming animal behavioural research. Transfer learning has enabled work in non-model species, but still requires hand-annotation of example footage, and is only performant in well-defined conditions. To overcome these limitations, we created replicAnt, a configurable pipeline implemented in Unreal Engine 5 and Python, designed to generate large and variable training datasets on consumer-grade hardware instead. replicAnt places 3D animal models into complex, procedurally generated environments, from which automatically annotated images can be exported. We demonstrate that synthetic data generated with replicAnt can significantly reduce the hand-annotation required to achieve benchmark performance in common applications such as animal detection, tracking, pose-estimation, and semantic segmentation; and that it increases the subject-specificity and domain-invariance of the trained networks, so conferring robustness. In some applications, replicAnt may even remove the need for hand-annotation altogether. It thus represents a significant step towards porting deep learning-based computer vision tools to the field.


“…We evaluate our model on five datasets. Four of them are synthetic multi-object datasets: CLEVR [35], CLEVRTex [37], MOVi-C, MOVi-E [24]. They present increasing levels of difficulty: CLEVRTex adds texture to objects and backgrounds, MOVi-C uses more complex objects and natural backgrounds, and MOVi-E contains large numbers of objects (up to 23) per scene.…”

Section: Methods (mentioning)

confidence: 99%

Object-Centric Slot Diffusion

Jiang, Deng, Singh et al. 2023

Preprint


Despite remarkable recent advances, making object-centric learning work for complex natural scenes remains the main challenge. The recent success of adopting the transformer-based image generative model in object-centric learning suggests that having a highly expressive image generator is crucial for dealing with complex scenes. In this paper, inspired by this observation, we aim to answer the following question: can we benefit from the other pillar of modern deep generative models, i.e., the diffusion models, for object-centric learning and what are the pros and cons of such a model? To this end, we propose a new object-centric learning model, Latent Slot Diffusion (LSD). LSD can be seen from two perspectives. From the perspective of object-centric learning, it replaces the conventional slot decoders with a latent diffusion model conditioned on the object slots. Conversely, from the perspective of diffusion models, it is the first unsupervised compositional conditional diffusion model which, unlike traditional diffusion models, does not require supervised annotation such as a text description to learn to compose. In experiments on various object-centric tasks, including the FFHQ dataset for the first time in this line of research, we demonstrate that LSD significantly outperforms the state-of-the-art transformer-based decoder, particularly when the scene is more complex. We also show a superior quality in unsupervised compositional generation.


“…A particular focus is placed on the acknowledgment of the simulation-to-real gap and how to tackle this particular challenge in the dataset generation process. Even though the first version of BlenderProc was one of the first tools to generate photo-realistic, synthetic datasets, many more tools exist nowadays, compared in Table 1 (Greff et al, 2022; Manolis Savva* et al, 2019; Morrical et al, 2021; Schwarz & Behnke, 2020; To et al, 2018). In contrast to the first version of BlenderProc, BlenderProc2 relies on an easy-to-use python API, whereas the first version used a YAML-based configuration approach (Denninger et al, 2019, 2020).…”

Section: Statement of Need (mentioning)

confidence: 99%

BlenderProc2: A Procedural Pipeline for Photorealistic Rendering

Denninger, Winkelbauer, Sundermeyer et al. 2023

JOSS


BlenderProc2 is a procedural pipeline that can render realistic images for the training of neural networks. Our pipeline can be employed in various use cases, including segmentation, depth, normal and pose estimation, and many others. A key feature of our Blender extension is the simple-to-use python API, designed to be easily extendable. Furthermore, many public datasets, such as 3D FRONT (Fu et al., 2021) or Shapenet (Chang et al., 2015), are already supported, making it easier to clutter synthetic scenes with additional objects.
