Deep learning has rapidly transformed the state-of-the-art algorithms used to address a variety of problems in computer vision and robotics. These breakthroughs have relied upon massive amounts of human-annotated training data, and this time-consuming annotation process has begun to impede progress. This paper describes a method that uses photo-realistic computer images from a simulation engine to rapidly generate annotated data for training machine learning algorithms. We demonstrate that a state-of-the-art architecture trained only on these synthetic annotations outperforms the identical architecture trained on human-annotated real-world data when tested on the KITTI dataset for vehicle detection. By training machine learning algorithms on a rich virtual world, real objects in real scenes can be learned and classified using synthetic data. This approach offers the possibility of accelerating deep learning's application to sensor-based classification problems like those that appear in self-driving cars. The source code and data to train and validate the networks described in this paper are made available for researchers.
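The abstract does not spell out how simulator ground truth becomes detector annotations, but the core step is straightforward to sketch. Below is a minimal illustration (Python/NumPy; the function, its arguments, and the placeholder fields are hypothetical, not the paper's code) of projecting a simulated object's 3D bounding-box corners through the camera intrinsics and emitting a KITTI-format label line:

```python
import numpy as np

def box_to_kitti_label(obj_type, corners_3d, K, img_w=1242, img_h=375):
    """Hypothetical exporter: project an object's 3D bounding-box corners
    (8x3, camera frame, metres) into the image and emit a KITTI-style
    label line. A real exporter would also fill in truncation, occlusion,
    alpha, dimensions, location, and rotation_y instead of placeholders."""
    pts = corners_3d @ K.T                 # perspective projection (8, 3)
    uv = pts[:, :2] / pts[:, 2:3]          # normalize by depth
    # Axis-aligned 2D box, clipped to the image bounds
    left, top = np.clip(uv.min(axis=0), 0, [img_w, img_h])
    right, bottom = np.clip(uv.max(axis=0), 0, [img_w, img_h])
    # KITTI label: type truncated occluded alpha bbox(4) dims(3) loc(3) ry
    return (f"{obj_type} 0.0 0 0.0 "
            f"{left:.2f} {top:.2f} {right:.2f} {bottom:.2f} "
            "0.0 0.0 0.0 0.0 0.0 0.0 0.0")
```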
This paper reports on WaterGAN, a generative adversarial network (GAN) for generating realistic underwater images from in-air image and depth pairings, used in an unsupervised pipeline for color correction of monocular underwater images. Cameras onboard autonomous and remotely operated vehicles can capture high-resolution images to map the seafloor; however, underwater image formation is subject to the complex process of light propagation through the water column. The raw images retrieved are characteristically different from images taken in air due to effects such as absorption and scattering, which attenuate light at different rates for different wavelengths. While this physical process is well described theoretically, the model depends on many parameters intrinsic to the water column as well as the structure of the scene. These factors make the parameters difficult to recover without simplifying assumptions or field calibration; hence, restoration of underwater images is a non-trivial problem. Deep learning has demonstrated great success in modeling complex nonlinear systems but requires a large amount of training data, which is difficult to compile in deep-sea environments. Using WaterGAN, we generate a large training dataset of corresponding depth maps, in-air color images, and realistic underwater images. This data serves as input to a two-stage network for color correction of monocular underwater images. Our proposed pipeline is validated through testing on real data collected from both a pure-water test tank and from underwater surveys in the field. Source code, sample datasets, and pretrained models are made publicly available.
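WaterGAN learns this image formation process from data rather than fitting it directly, but the attenuation-plus-backscatter model the abstract alludes to is easy to state. A minimal sketch follows (Python/NumPy; the coefficient values are illustrative assumptions, not parameters from the paper):

```python
import numpy as np

def underwater_image(J, depth, beta=(0.40, 0.15, 0.07), B=(0.05, 0.25, 0.30)):
    """Simplified underwater image formation:
        I_c = J_c * exp(-beta_c * d) + B_c * (1 - exp(-beta_c * d))
    where J is the in-air RGB image in [0, 1], depth d is per-pixel range
    in metres, beta_c is the wavelength-dependent attenuation coefficient
    (red attenuates fastest), and B_c is the veiling-light color.
    The default coefficients are illustrative, not fitted values."""
    beta = np.asarray(beta).reshape(1, 1, 3)
    B = np.asarray(B).reshape(1, 1, 3)
    t = np.exp(-beta * depth[..., None])   # per-channel transmission
    return J * t + B * (1.0 - t)           # attenuated signal + backscatter
```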
Robust, scalable simultaneous localization and mapping (SLAM) algorithms support the successful deployment of robots in real-world applications. In many cases these platforms deliver vast amounts of sensor data from large-scale, unstructured environments, which can be difficult for end users to interpret without further processing and suitable visualization tools. We present a robust, automated system for large-scale three-dimensional (3D) reconstruction and visualization that takes stereo imagery from an autonomous underwater vehicle (AUV) and SLAM-based vehicle poses, and delivers detailed 3D models of the seafloor in the form of textured polygonal meshes. Our system must cope with thousands of images, lighting conditions that create visual seams when texturing, and possible inconsistencies between stereo meshes arising from errors in calibration, triangulation, and navigation. Our approach breaks the problem into manageable stages by first estimating local structure and then combining these estimates into a composite georeferenced structure using SLAM-based vehicle pose estimates. A texture-mapped surface at multiple scales is then generated and presented interactively to the user through a visualization engine. We adapt established solutions where possible, with an emphasis on quickly delivering approximate yet visually consistent reconstructions on standard computing hardware, allowing scientists on a research cruise to use our system to design follow-up deployments of the AUV and complementary instruments. To date, this system has been tested on several research cruises in Australian waters and has been used to reliably generate and visualize reconstructions for more than 60 dives covering diverse habitats and representing hundreds of linear kilometers of survey.
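The "estimate locally, then combine with SLAM poses" step amounts to transforming each local stereo mesh into the common georeferenced frame. A hypothetical helper (Python/NumPy; not the authors' code, which also handles texturing and inter-mesh inconsistencies) makes the step concrete:

```python
import numpy as np

def to_global_frame(local_vertices, pose):
    """Place one local stereo mesh into the composite map.
    `pose` is the 4x4 homogeneous SLAM-estimated transform from the
    vehicle frame at capture time to the world frame; `local_vertices`
    is an (N, 3) array of triangulated points in the vehicle frame."""
    R, t = pose[:3, :3], pose[:3, 3]
    return local_vertices @ R.T + t   # rotate then translate each point
```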
This paper demonstrates how multi-scale measures of rugosity, slope, and aspect can be derived from fine-scale bathymetric reconstructions created from geo-referenced stereo imagery. We generate three-dimensional reconstructions over large spatial scales using data collected by autonomous underwater vehicles (AUVs), remotely operated vehicles (ROVs), manned submersibles, and diver-held imaging systems. We propose a new method for calculating rugosity in a Delaunay triangulated surface mesh by projecting triangle areas onto the plane of best fit found using principal component analysis (PCA). Slope and aspect can be calculated with very little extra effort, and fitting a plane serves to decouple rugosity from slope. We compare the results of the virtual terrain-complexity calculations with experimental results obtained using conventional in-situ measurement methods, and show that performing the calculations over a digital terrain reconstruction is more flexible, robust, and easily repeatable. In addition, the method is non-contact and has far less environmental impact than traditional survey techniques. For diver-based surveys, the time underwater needed to collect rugosity data is significantly reduced and, being an image-based technique, it can use robotic platforms that operate beyond diver depths. Measurements can be computed exhaustively at multiple scales for surveys with tens of thousands of images covering thousands of square metres. The technique is demonstrated on data gathered by a diver rig and an AUV, on small single-transect surveys and on a larger, dense survey. Stereo images provide 3D structure as well as visual appearance, which could potentially feed into automated classification techniques. Our multi-scale rugosity, slope, and aspect measures have already been adopted in a number of marine science studies. This paper presents a detailed description of the method and thoroughly validates it against traditional in-situ measurements.
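The rugosity measure described above is the ratio of true surface area to the area of the mesh projected onto its PCA plane of best fit; fitting the plane, rather than projecting onto the horizontal, is what decouples rugosity from slope. A sketch of that calculation (Python/NumPy; our own rendering of the published method, not the authors' code):

```python
import numpy as np

def triangle_areas(V, F):
    """Areas of triangles F (M x 3 vertex indices) over vertices V (N x 3)."""
    a, b, c = V[F[:, 0]], V[F[:, 1]], V[F[:, 2]]
    return 0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1)

def rugosity_pca(V, F):
    """Rugosity of a triangulated patch: ratio of 3D surface area to the
    area projected onto the patch's PCA plane of best fit. The plane
    normal is the direction of least variance; slope and aspect can be
    read off the same normal with little extra work."""
    centred = V - V.mean(axis=0)
    _, _, Vt = np.linalg.svd(centred, full_matrices=False)
    basis = Vt[:2]                        # two in-plane principal directions
    V_proj = centred @ basis.T            # project vertices onto the plane
    V_flat = np.column_stack([V_proj, np.zeros(len(V))])
    return triangle_areas(V, F).sum() / triangle_areas(V_flat, F).sum()
```

A perfectly flat patch gives a ratio of 1.0, with rougher terrain yielding larger values regardless of the patch's overall tilt.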