Most rescue robots are teleoperated and integrate some level of autonomy to reduce the operator’s workload, allowing them to focus on the primary mission tasks. One of the main causes of mission failure is human error, and increasing the robot’s autonomy can increase the probability of success. For this reason, in this work, a stair detection and characterization pipeline is presented. The pipeline is built on the ROS middleware, YOLOv4-tiny, and a region-growing-based clustering algorithm, and is tested on a differential-drive robot. The pipeline’s staircase detector runs on the Neural Compute Engines (NCEs) of the OpenCV AI Kit with Depth (OAK-D) RGB-D camera, so it can run on the robot’s computer without a GPU and can therefore be deployed on similar robots to increase their autonomy. Furthermore, on top of this pipeline we implemented a fuzzy controller that allows the robot to align itself autonomously with the staircase. Our work can be used on different robots running the ROS middleware and can increase autonomy, allowing the operator to focus on the primary mission tasks. Moreover, by design, the pipeline can be used with different types of RGB-D cameras, including those that generate noisy point clouds from low-disparity depth images.
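The abstract does not specify the clustering details, but a minimal sketch of the general region-growing idea over a point cloud may help. This is an illustration of the technique, not the authors' implementation; the Euclidean-distance growth criterion, the `radius` threshold, and the `min_size` noise filter are all hypothetical choices.

```python
# Minimal sketch of region-growing clustering over a point cloud.
# Assumption: regions grow by Euclidean proximity; the paper may use a
# different criterion (e.g., surface normals). Parameters are illustrative.
import numpy as np
from scipy.spatial import cKDTree

def region_grow(points: np.ndarray, radius: float = 0.05, min_size: int = 30):
    """Group points into clusters by growing regions of mutually close points."""
    tree = cKDTree(points)
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = [seed], [seed]
        while frontier:
            idx = frontier.pop()
            # Expand the region with unvisited neighbors within `radius`.
            for nb in tree.query_ball_point(points[idx], r=radius):
                if nb in unvisited:
                    unvisited.remove(nb)
                    cluster.append(nb)
                    frontier.append(nb)
        if len(cluster) >= min_size:  # discard sparse noise clusters
            clusters.append(np.asarray(cluster))
    return clusters

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic "steps": planar patches offset in height and depth.
    step1 = rng.uniform([0, 0.0, 0.00], [1, 0.3, 0.01], size=(500, 3))
    step2 = rng.uniform([0, 0.3, 0.15], [1, 0.6, 0.16], size=(500, 3))
    cloud = np.vstack([step1, step2])
    print([len(c) for c in region_grow(cloud)])  # expect two clusters of ~500
```

Grouping step candidates this way also tolerates noisy point clouds from low-disparity depth images, since small spurious clusters fall below the minimum-size filter.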
In this paper, a Deep Learning architecture for the three-dimensional reconstruction of outdoor environments under challenging terrain conditions is presented. The proposed architecture is configured as an Autoencoder, but with some departures from the typical convolutional layers. The Encoder stage is a residual network with four residual blocks, trained to extract feature maps from aerial images of outdoor environments. The Decoder stage, in turn, is a Generative Adversarial Network (GAN), referred to as the GAN-Decoder. The network takes a sequence of 2D aerial images as input. The Encoder extracts the feature vector that describes the input image, while the GAN-Decoder generates a point cloud from that representation. By supplying a sequence of frames with a percentage of overlap between them, it is possible to determine the spatial location of each generated point. The experiments show that this proposal can produce a 3D representation of an area overflown by a drone, using the point cloud generated by a deep architecture that takes a sequence of 2D aerial images as input. In comparison with other works, our proposed system is capable of performing three-dimensional reconstructions of challenging urban landscapes. Compared with results obtained using commercial software, our proposal generates reconstructions in less processing time, requires a lower overlap percentage between 2D images, and is invariant to the type of flight path.
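A minimal PyTorch sketch of the described Encoder/GAN-Decoder pairing is given below: a residual Encoder with four residual blocks that maps an aerial frame to a feature vector, and a generator-style Decoder that maps that vector to an N x 3 point cloud. The layer widths, latent dimension, and point count are hypothetical, and the adversarial (discriminator) half of the GAN-Decoder is omitted.

```python
# Sketch of the described architecture, not the authors' exact network.
# Assumptions: 64-channel residual blocks, a 256-d latent vector, and a
# 2048-point output cloud; the GAN discriminator and training loop are omitted.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch),
        )
    def forward(self, x):
        return torch.relu(x + self.body(x))  # skip connection

class Encoder(nn.Module):
    """Residual net with four residual blocks -> feature vector."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.stem = nn.Conv2d(3, 64, 7, stride=2, padding=3)
        self.blocks = nn.Sequential(*[ResidualBlock(64) for _ in range(4)])
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(64, latent_dim))
    def forward(self, img):
        return self.head(self.blocks(self.stem(img)))

class GanDecoder(nn.Module):
    """Generator half of the GAN-Decoder: latent vector -> point cloud."""
    def __init__(self, latent_dim=256, n_points=2048):
        super().__init__()
        self.n_points = n_points
        self.mlp = nn.Sequential(
            nn.Linear(latent_dim, 512), nn.ReLU(),
            nn.Linear(512, 1024), nn.ReLU(),
            nn.Linear(1024, n_points * 3),
        )
    def forward(self, z):
        return self.mlp(z).view(-1, self.n_points, 3)

if __name__ == "__main__":
    enc, dec = Encoder(), GanDecoder()
    frame = torch.randn(1, 3, 128, 128)  # one aerial RGB frame
    points = dec(enc(frame))
    print(points.shape)                  # torch.Size([1, 2048, 3])
```

In this reading, each overlapping frame in the sequence yields one such partial cloud, and the overlap between consecutive frames is what anchors the generated points in a common spatial frame.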