Recent work has focused on generating synthetic imagery to increase the size and variability of training data for learning visual tasks in urban scenes. This includes increasing the occurrence of occlusions or varying environmental and weather effects. However, few works have addressed modeling variation in the sensor domain. Sensor effects can degrade real images, which limits how well networks trained on synthetic data generalize to visual tasks in real environments. This paper proposes an efficient, automatic, physically-based augmentation pipeline to vary sensor effects (chromatic aberration, blur, exposure, noise, and color temperature) for synthetic imagery. In particular, this paper illustrates that augmenting synthetic training datasets with the proposed pipeline reduces the domain gap between synthetic and real domains for the task of object detection in urban driving scenes.
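To make the five sensor effects concrete, the following is a minimal sketch of a randomized augmentation chain in Python with NumPy. It is not the authors' pipeline: the function names, the parameter ranges, the box filter standing in for a physically-based blur, the per-channel gains standing in for a color temperature model, and the order of operations (simply the order in which the effects are listed above) are all assumptions chosen for illustration.

```python
import numpy as np


def chromatic_aberration(img, shift=1):
    """Shift red and blue channels horizontally in opposite directions (crude lateral model)."""
    out = img.copy()
    out[..., 0] = np.roll(img[..., 0], shift, axis=1)
    out[..., 2] = np.roll(img[..., 2], -shift, axis=1)
    return out


def box_blur(img, radius=1):
    """Average over a (2*radius+1)^2 window; a stand-in for a Gaussian blur."""
    if radius <= 0:
        return img.copy()
    k = 2 * radius + 1
    h, w = img.shape[:2]
    padded = np.pad(img, ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    out = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)


def exposure(img, gain=1.0):
    """Scale intensities by a gain and clip to the valid range."""
    return np.clip(img * gain, 0.0, 1.0)


def add_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)


def color_temperature(img, red_gain=1.0, blue_gain=1.0):
    """Warm or cool the image by scaling the red and blue channels."""
    gains = np.array([red_gain, 1.0, blue_gain])
    return np.clip(img * gains, 0.0, 1.0)


def augment_sensor_effects(img, rng):
    """Apply all five effects with randomly sampled (hypothetical) parameters.

    img: float array of shape (H, W, 3) with values in [0, 1], RGB order.
    """
    img = np.asarray(img, dtype=np.float64)
    img = chromatic_aberration(img, shift=int(rng.integers(-2, 3)))
    img = box_blur(img, radius=int(rng.integers(0, 3)))
    img = exposure(img, gain=rng.uniform(0.6, 1.4))
    img = add_noise(img, sigma=rng.uniform(0.0, 0.05), rng=rng)
    warmth = rng.uniform(-0.1, 0.1)
    return color_temperature(img, red_gain=1.0 + warmth, blue_gain=1.0 - warmth)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    synthetic = rng.random((64, 96, 3))  # placeholder for a rendered frame
    augmented = augment_sensor_effects(synthetic, rng)
    print(augmented.shape, augmented.dtype)
```

Sampling fresh parameters for each training image is what provides the sensor-domain variation; in practice the ranges would need to be chosen to resemble the target camera rather than taken from this sketch.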
Performance on benchmark datasets has drastically improved with advances in deep learning. Still, cross-dataset generalization performance remains relatively low due to the domain shift that can occur between two different datasets. This domain shift is especially pronounced between synthetic and real datasets. Significant research has been done to reduce this gap, specifically by modeling variation in the spatial layout of a scene, such as occlusions, and in scene environmental factors, such as time of day and weather effects. However, few works have addressed modeling the variation in the sensor domain as a means of reducing the synthetic-to-real domain gap. The camera or sensor used to capture a dataset introduces artifacts into the image data that are unique to the sensor model, suggesting that sensor effects may also contribute to domain shift. To address this, we propose a learned augmentation network composed of physically-based augmentation functions. Our proposed augmentation pipeline transfers specific effects of the sensor model (chromatic aberration, blur, exposure, noise, and color temperature) from a real dataset to a synthetic dataset. We provide experiments demonstrating that augmenting synthetic training datasets with the proposed learned augmentation framework reduces the domain gap between synthetic and real domains for object detection in urban driving scenes.
1 A. Carlson and K. A. Skinner are with the Robotics Institute, University of Michigan,
There are particular similarities between how machines learn about the nature of their environment and how humans learn to process visual stimuli. Machine Learning (ML) algorithms, and more specifically Deep Neural Network algorithms, rely on expansive image databases and various training methods (supervised, unsupervised) to “make sense” of the content of an image. Take, for example, how students of architecture learn to differentiate various architectural styles. Whether the task is to distinguish Gothic, Baroque, or Modern Architecture, students are exposed to hundreds or even thousands of images of the respective styles while being trained by faculty to tell those styles apart. A reversal of the process, striving to produce imagery instead of reading it and understanding its content, allows machine vision techniques to be utilized as a design methodology that profoundly interrogates aspects of agency and authorship in the presence of Artificial Intelligence in architecture design. This notion forms part of a larger conversation on the nature of human ingenuity operating within a posthuman design ecology. The inherent ability of Neural Networks to process large databases opens up the opportunity to sift through the enormous repositories of imagery generated by the architecture discipline through the ages in order to find novel and bespoke solutions to architectural problems. This article strives to demystify the romantic idea of individual artistic design choices in architecture by providing a glimpse under the hood of the inner workings of Neural Network processes, and thus the extent of their ability to inform architectural design. The approach takes cues from the language and methods employed by experts in Deep Learning, such as Hallucinations, Dreaming, Style Transfer, and Vision. The presented approach is the basis for an in-depth exploration of its meaning as a cultural technique within the discipline.
Culture, in the context of this article, pertains to ideas such as the differentiation between symbolic and material cultures, in which symbols are defined as the common denominator of a specific group of people.1 The understanding and exchange of symbolic values is inherently connected to language and code, which ultimately form the ingrained texture of any form of coded environment, including the coded structure of Neural Networks. A first proof-of-concept project was devised by the authors in the form of the Robot Garden. What makes the Robot Garden a distinctively novel project is the move from a purely two-dimensional approach to designing with the aid of Neural Networks to the exploration of 2D-to-3D Neural Style Transfer methods in the design process.