Safe UAV landing: A low-complexity pipeline for surface conditions recognition

Tsintotas, Konstantinos A.; Bampis, Loukas; Taitzoglou, Anastasios; Kansizoglou, Ioannis; Γαστεράτος, Αντώνιος

doi:10.1109/ist50367.2021.9651358

Cited by 15 publications

(2 citation statements)

References 37 publications


“…While the pipeline in the context of this paper is limited in detecting and avoiding only people during landing, this can be easily altered by training the network of the PCG module to detect a different set of landing obstacles. The pipeline can also be extended to integrate a slope detection module [31], so that uneven terrains would be excluded as well. Another way that the framework can be extended is to give each landing spot a score based on a multi-criteria analysis in relation to slope, people density and activity, distance from base or UAV, detection confidence etc.…”

Section: Discussion (mentioning)

confidence: 99%

Embedded light-weight approach for safe landing in populated areas

Mitroudas, Balaska, Psomoulis et al. 2023

Preprint


Landing safety is a challenge that has recently drawn considerable attention from the research community, owing to the growing interest in applications enabled by aerial vehicles. In this paper, we propose a landing safety pipeline based on state-of-the-art object detectors and OctoMap. First, a point cloud of surface obstacles is generated and inserted into an OctoMap. The unoccupied areas are then identified, resulting in a list of safe landing points. Owing to the low inference time of state-of-the-art object detectors and the efficient point cloud manipulation that OctoMap provides, our approach can be deployed on lightweight embedded systems. The proposed pipeline has been evaluated in many simulation scenarios that vary in people density, number, and movement. Simulations were executed with an Nvidia Jetson Nano in the loop to confirm the pipeline's performance and robustness on hardware with low computing power. The experiments yielded promising results, with a 95% success rate.
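The "insert obstacles, then read off the unoccupied areas" step in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the OctoMap library: a flat 2D occupancy grid stands in for the 3D octree, and the function name `safe_landing_cells` and the parameters `area_size`, `cell_size`, and `margin_cells` are hypothetical names chosen for illustration.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in metres; z is ignored here


def safe_landing_cells(
    obstacle_points: Iterable[Point],
    area_size: Tuple[float, float],
    cell_size: float = 1.0,
    margin_cells: int = 1,
) -> List[Tuple[float, float]]:
    """Return the centres of grid cells with no obstacle nearby.

    Assumption: a 2D grid replaces the OctoMap octree, and every obstacle
    point also blocks the cells within `margin_cells` of it.
    """
    cols = int(area_size[0] // cell_size)
    rows = int(area_size[1] // cell_size)
    occupied = [[False] * cols for _ in range(rows)]

    # Mark the cell of each obstacle point, plus a safety margin around it.
    for x, y, _z in obstacle_points:
        c, r = int(x // cell_size), int(y // cell_size)
        for rr in range(r - margin_cells, r + margin_cells + 1):
            for cc in range(c - margin_cells, c + margin_cells + 1):
                if 0 <= rr < rows and 0 <= cc < cols:
                    occupied[rr][cc] = True

    # Every cell left unoccupied is a candidate landing point.
    return [
        ((c + 0.5) * cell_size, (r + 0.5) * cell_size)
        for r in range(rows)
        for c in range(cols)
        if not occupied[r][c]
    ]
```

For example, `safe_landing_cells([(1.5, 1.5, 0.0)], (4.0, 4.0))` blocks the 3 x 3 patch of cells around the single obstacle and returns the remaining cell centres along the far edges of the area.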


“…Despite the promising results of the former systems, their need to adapt to changes (e.g., environmental, frame background, or camera motion) between videos containing the same action presents a disadvantage. In contrast, the latter can adjust to the above challenges, showing remarkable outcomes in different computer vision and robotics tasks [26] (e.g., image recognition [27], object detection [28,29], visual-based navigation [30,31], place recognition [32][33][34], loop closure detection [35,36], and video description [37]). In particular, these approaches use two-dimensional CNNs (2D-CNNs) that receive a grid of values as input (i.e., an image) and subsequently perform spatial analysis via 2D convolutional filters.…”

Section: Introduction (mentioning)

confidence: 99%

Evaluating the Performance of Mobile-Convolutional Neural Networks for Spatial and Temporal Human Action Recognition Analysis

Moutsis, Tsintotas, Kansizoglou et al. 2023

Robotics

Self Cite


Human action recognition is a computer vision task that identifies how a person or a group acts in a video sequence. Various methods that rely on deep-learning techniques, such as two- or three-dimensional convolutional neural networks (2D-CNNs, 3D-CNNs), recurrent neural networks (RNNs), and vision transformers (ViT), have been proposed to address this problem over the years. Motivated by the high complexity of most CNNs used in human action recognition, and by the need for implementations on mobile platforms with restricted computational resources, in this article we conduct an extensive evaluation of the performance metrics of five lightweight architectures. In particular, we examine how these mobile-oriented CNNs (viz., ShuffleNet-v2, EfficientNet-b0, MobileNet-v3, and GhostNet) perform in spatial analysis compared to a recent tiny ViT, namely EVA-02-Ti, and a more computationally demanding model, ResNet-50. Our models, previously trained on ImageNet and BU101, are measured for their classification accuracy on HMDB51, UCF101, and six classes of the NTU dataset. The average and max scores, as well as the voting approaches, are generated from three and fifteen RGB frames of each video, while two different rates for the dropout layers were assessed during training. Lastly, a temporal analysis via multiple types of RNNs that employ features extracted by the trained networks is examined. Our results reveal that EfficientNet-b0 and EVA-02-Ti surpass the other mobile-CNNs, achieving comparable or superior performance to ResNet-50.
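The abstract mentions average, max, and voting approaches over several frames of a video but does not spell them out. The sketch below shows one plausible reading, assuming each frame yields a vector of per-class scores; the function names are hypothetical and the paper's exact procedure may differ.

```python
from collections import Counter
from typing import Sequence


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def aggregate_average(frame_scores: Sequence[Sequence[float]]) -> int:
    """Average each class score over the frames, then pick the best class."""
    n_classes = len(frame_scores[0])
    means = [
        sum(frame[k] for frame in frame_scores) / len(frame_scores)
        for k in range(n_classes)
    ]
    return argmax(means)


def aggregate_max(frame_scores: Sequence[Sequence[float]]) -> int:
    """Take each class's highest score over the frames, then pick the best class."""
    n_classes = len(frame_scores[0])
    peaks = [max(frame[k] for frame in frame_scores) for k in range(n_classes)]
    return argmax(peaks)


def aggregate_vote(frame_scores: Sequence[Sequence[float]]) -> int:
    """Let each frame vote for its top class and return the majority class."""
    votes = Counter(argmax(frame) for frame in frame_scores)
    return votes.most_common(1)[0][0]
```

The three rules can disagree: with frame scores `[0.6, 0.3, 0.1]`, `[0.55, 0.35, 0.1]`, and `[0.05, 0.05, 0.9]`, averaging and voting both select class 0, while the max rule selects class 2 because a single confident frame dominates.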
