Abstract: Present image-based visual servoing approaches rely on extracting hand-crafted visual features from an image. Choosing the right set of features is important, as it directly affects the performance of any approach. Motivated by recent breakthroughs in the performance of data-driven methods on recognition and localization tasks, we aim to learn visual feature representations suitable for servoing tasks in unstructured and unknown environments. In this paper, we present an end-to-end learning-based approach for visual servoing in diverse scenes where knowledge of camera parameters and scene geometry is not available a priori. This is achieved by training a convolutional neural network over color images with synchronised camera poses. Through experiments performed in simulation and on a quadrotor, we demonstrate the efficacy and robustness of our approach for a wide range of camera poses in both indoor and outdoor environments.
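For concreteness, a minimal sketch of such a setup is given below. It assumes a PyTorch-style stacked-image CNN that regresses the 6-DoF relative camera pose from a current/desired image pair; the architecture, layer sizes, and loss are illustrative assumptions, not the network described in the paper.

```python
# Hypothetical sketch: a CNN that takes the current and desired RGB images
# stacked channel-wise and regresses the 6-DoF relative camera pose.
# Layer sizes, names, and the loss are illustrative assumptions.
import torch
import torch.nn as nn

class RelativePoseNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 32, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(128, 6)  # [tx, ty, tz, rx, ry, rz]

    def forward(self, current_img, desired_img):
        x = torch.cat([current_img, desired_img], dim=1)  # (B, 6, H, W)
        feat = self.encoder(x).flatten(1)
        return self.head(feat)

# One training step on (current image, desired image, relative pose) triplets,
# where the target pose comes from the synchronised camera poses.
model = RelativePoseNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

current = torch.randn(4, 3, 224, 224)  # placeholder batch
desired = torch.randn(4, 3, 224, 224)
gt_pose = torch.randn(4, 6)            # placeholder ground-truth poses

optimizer.zero_grad()
loss = loss_fn(model(current, desired), gt_pose)
loss.backward()
optimizer.step()
```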
Existing deep learning based visual servoing approaches regress the relative camera pose between a pair of images. Therefore, they require a huge amount of training data and sometimes fine-tuning for adaptation to a novel scene. Furthermore, current approaches do not consider the underlying geometry of the scene and rely on direct estimation of the camera pose. Thus, inaccuracies in the predicted camera pose, especially for distant goals, degrade the servoing performance. In this paper, we propose a two-fold solution: (i) we use optical flow as our visual features, predicted by a deep neural network; (ii) these flow features are then systematically integrated with depth estimates provided by another neural network through the interaction matrix. We further present an extensive benchmark in a photo-realistic 3D simulation across diverse scenes to study the convergence and generalisation of visual servoing approaches. We show convergence for camera transformations of over 3 m and 40 degrees while maintaining positioning precision of under 2 cm and 1 degree on our challenging benchmark, where existing approaches fail to converge in the majority of scenarios beyond 1.5 m and 20 degrees. Furthermore, we also evaluate our approach in a real scenario on an aerial robot. Our approach generalizes to novel scenarios, producing precise and robust servoing performance for 6 degrees-of-freedom positioning tasks, even with large camera transformations, without any retraining or fine-tuning.
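To make the flow-plus-depth control step concrete, the sketch below follows the classical image-based visual servoing law, treating the dense flow between the current and desired images as the feature error and using per-pixel depth to build the point-feature interaction matrix. The pixel sampling, gain, and sign conventions are illustrative assumptions rather than the authors' exact implementation, and the flow and depth arrays stand in for the two networks' outputs.

```python
# Hedged sketch of the implied control law: camera velocity v = -lambda * pinv(L) @ e,
# with the feature error e taken from predicted optical flow and the interaction
# matrix L built from predicted depth. Sampling stride and gain are assumptions.
import numpy as np

def interaction_matrix(x, y, Z):
    """Interaction matrix of a normalised image point (x, y) at depth Z."""
    return np.array([
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x],
    ])

def servo_velocity(flow, depth, K, gain=0.5, stride=16):
    """6-DoF camera velocity [vx, vy, vz, wx, wy, wz] from flow and depth."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    L_rows, errors = [], []
    H, W = depth.shape
    for v in range(0, H, stride):            # subsample pixels for speed
        for u in range(0, W, stride):
            Z = max(depth[v, u], 1e-3)       # guard against zero depth
            x, y = (u - cx) / fx, (v - cy) / fy  # normalised coordinates
            L_rows.append(interaction_matrix(x, y, Z))
            # flow (pixels) converted to the normalised image plane
            errors.append([flow[v, u, 0] / fx, flow[v, u, 1] / fy])
    L = np.vstack(L_rows)                    # stacked (2N, 6) interaction matrix
    e = np.concatenate(errors)               # stacked (2N,) feature error
    return -gain * np.linalg.pinv(L) @ e

# Example with placeholder network outputs (flow: current -> desired, depth in metres)
flow = np.zeros((240, 320, 2))
depth = np.ones((240, 320))
K = np.array([[300.0, 0.0, 160.0], [0.0, 300.0, 120.0], [0.0, 0.0, 1.0]])
velocity_command = servo_velocity(flow, depth, K)
```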
Visual servoing approaches navigate a robot to the desired pose with respect to a given object using image measurements. As a result, these approaches have several applications in manipulation, navigation and inspection. However, existing visual servoing approaches are instance specific, that is, they control camera motion between two views of the same object. In this paper, we present a framework for visual servoing to a novel object instance. We further employ our framework for the autonomous inspection of vehicles using Micro Aerial Vehicles (MAVs), which is vital for day-to-day maintenance, damage assessment, and merchandising a vehicle. This visual inspection task comprises the MAV visiting the essential parts of the vehicle, for example, wheels and lights, to get a closer look at the damage incurred. Existing methods for autonomous inspection cannot be extended to vehicles for the following reasons: First, several existing methods require a 3D model of the structure, which is not available for every vehicle. Second, existing methods require an expensive depth sensor for localization and path planning. Third, current approaches do not account for semantic understanding of the vehicle, which is essential for identifying its parts. Our instance-invariant visual servoing framework is capable of autonomously navigating to every essential part of a vehicle for inspection and can be initialized from any random pose. To the best of our knowledge, this is the first approach demonstrating fully autonomous visual inspection of vehicles using MAVs. We have validated the efficacy of our approach through a series of experiments in simulation and in outdoor scenarios.