In this work, we propose a method for estimating depth from a monocular camera image in order to avoid collisions during autonomous drone flight. The maximum flight speed of a drone is generally about 22.2 m/s, and long-range depth information is crucial for autonomous flight: without it, a drone flying at high speed is prone to collisions. However, depth cameras that can measure long ranges are too heavy to mount on a drone. This work applies Pix2Pix, a kind of Conditional Generative Adversarial Network (CGAN), to generate depth images from monocular camera images, and additionally uses optical flow to improve the accuracy of depth estimation. We propose a highly accurate depth estimation method that effectively embeds an optical flow map into the monocular image. The models are trained using AirSim, a flight simulator that can capture both monocular and depth images at ranges of over a hundred meters in a virtual environment, so our model generates depth images that provide longer-range information than those captured by a common depth camera. We evaluate the accuracy and error of the proposed method on test images from AirSim, and additionally use the method in flight simulation to evaluate its effectiveness for collision avoidance. The results show that the proposed method achieves higher accuracy, lower error, and a lower collision rate than the state of the art.
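As a rough illustration of the idea described above, the sketch below shows one plausible way to condition a Pix2Pix-style generator on both a monocular RGB frame and an optical flow map by concatenating the flow as extra input channels. This is not the authors' code; the class name, layer sizes, and the 2-channel (u, v) flow encoding are illustrative assumptions.

```python
# Minimal sketch (PyTorch): a U-Net-like generator mapping (RGB + flow) -> depth.
# All architectural details here are assumptions for illustration only.
import torch
import torch.nn as nn

class FlowConditionedGenerator(nn.Module):
    """Encoder-decoder with one skip connection, conditioned on optical flow."""
    def __init__(self, in_channels: int = 5):  # 3 RGB + 2 flow channels (assumed)
        super().__init__()
        self.enc1 = self._down(in_channels, 64)   # H -> H/2
        self.enc2 = self._down(64, 128)           # H/2 -> H/4
        self.dec1 = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),  # H/4 -> H/2
            nn.ReLU(inplace=True),
        )
        self.dec2 = nn.Sequential(
            # skip concatenation doubles the channel count to 128
            nn.ConvTranspose2d(128, 1, 4, stride=2, padding=1),   # H/2 -> H
            nn.Tanh(),  # depth normalized to [-1, 1], as with Pix2Pix targets
        )

    @staticmethod
    def _down(cin: int, cout: int) -> nn.Sequential:
        return nn.Sequential(
            nn.Conv2d(cin, cout, 4, stride=2, padding=1),
            nn.BatchNorm2d(cout),
            nn.LeakyReLU(0.2, inplace=True),
        )

    def forward(self, rgb: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
        # "Embedding" the flow map into the monocular image here simply means
        # channel-wise concatenation before the first convolution.
        x = torch.cat([rgb, flow], dim=1)
        e1 = self.enc1(x)
        e2 = self.enc2(e1)
        d1 = self.dec1(e2)
        return self.dec2(torch.cat([d1, e1], dim=1))  # 1-channel depth map

# Usage with a batch of 256x256 inputs:
# depth = FlowConditionedGenerator()(torch.randn(2, 3, 256, 256),
#                                    torch.randn(2, 2, 256, 256))
```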
In recent years, autonomous drones have attracted attention in many fields due to their convenience. Autonomous drones require precise depth information to avoid collisions while flying fast, and both RGB images and LiDAR point clouds are often fed to applications based on Convolutional Neural Networks (CNNs) to estimate the distance to obstacles. Such applications are implemented on embedded onboard systems. To estimate depth precisely, these CNN models are generally made complex enough to extract many features, which increases computational cost and leads to long inference times. To address this issue, we employ optical flow to aid depth estimation. In addition, we propose a new attention structure that makes maximum use of optical flow without complicating the network. Furthermore, we improve performance without modifying the depth estimator by adding a perceptual discriminator during training. The proposed model is evaluated in terms of accuracy, error, and inference time on the KITTI dataset. In our experiments, the proposed method achieves up to 34% higher accuracy, 55% lower error, and 66% faster inference on a Jetson Nano compared to previous methods. The proposed method is also evaluated on collision avoidance in simulated drone flight and achieves the lowest collision rate of all the estimation methods compared. These experimental results show the potential of the proposed method for real-world autonomous drone flight applications.
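To make the "attention structure that makes maximum use of optical flow without complicating the network" concrete, the sketch below shows one lightweight way such a module could look: an attention map derived from flow features re-weights the image features via cheap 1x1 convolutions. The abstract does not specify the design, so the module name, channel counts, and the residual formulation are all assumptions.

```python
# Minimal sketch (PyTorch): flow-guided attention over image features.
# A hypothetical module, not the paper's exact design.
import torch
import torch.nn as nn

class FlowAttention(nn.Module):
    """Re-weights image features using an attention map computed from flow features.

    Assumes img_feat and flow_feat share the same spatial resolution.
    """
    def __init__(self, img_ch: int, flow_ch: int):
        super().__init__()
        # A single 1x1 convolution keeps the added parameter count and
        # inference cost negligible, matching the "without complicating
        # the network" claim.
        self.gate = nn.Sequential(
            nn.Conv2d(flow_ch, img_ch, kernel_size=1),
            nn.Sigmoid(),  # per-pixel, per-channel weights in (0, 1)
        )

    def forward(self, img_feat: torch.Tensor, flow_feat: torch.Tensor) -> torch.Tensor:
        attn = self.gate(flow_feat)
        # Residual form: even where attention is near zero, the original
        # image features (and their gradients) still pass through.
        return img_feat + img_feat * attn

# Usage with feature maps of matching spatial size:
# fused = FlowAttention(img_ch=64, flow_ch=32)(torch.randn(2, 64, 32, 32),
#                                              torch.randn(2, 32, 32, 32))
```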