Autonomous robots that assist humans in day-to-day living tasks are becoming increasingly popular. Autonomous mobile robots operate by sensing and perceiving their surrounding environment to make accurate driving decisions. A combination of several different sensors, such as LiDAR, radar, ultrasound sensors and cameras, is utilized to sense the surrounding environment of autonomous vehicles. These heterogeneous sensors simultaneously capture various physical attributes of the environment. Such multimodality and redundancy of sensing need to be positively utilized for reliable and consistent perception of the environment through sensor data fusion. However, these multimodal sensor data streams differ from each other in many ways, such as temporal and spatial resolution, data format, and geometric alignment. For the subsequent perception algorithms to utilize the diversity offered by multimodal sensing, the data streams need to be spatially, geometrically and temporally aligned with each other. In this paper, we address the problem of fusing the outputs of a Light Detection and Ranging (LiDAR) scanner and a wide-angle monocular image sensor for free space detection. The outputs of the LiDAR scanner and the image sensor have different spatial resolutions and need to be aligned with each other. A geometrical model is used to spatially align the two sensor outputs, followed by a Gaussian Process (GP) regression-based resolution matching algorithm that interpolates the missing data with quantifiable uncertainty. The results indicate that the proposed sensor data fusion framework significantly aids the subsequent perception steps, as illustrated by the performance improvement of an uncertainty-aware free space detection algorithm.
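The resolution-matching step can be pictured with a short sketch: sparse LiDAR returns, already projected into image coordinates by the geometric model, are interpolated onto a dense pixel grid with a GP regressor, which also returns a per-pixel standard deviation as the uncertainty estimate. The kernel choice, the array names (`lidar_uv`, `lidar_depth`) and the synthetic data below are illustrative assumptions, not the authors' exact pipeline.

```python
# Minimal sketch of GP-based resolution matching between sparse projected LiDAR
# depths and a dense image grid (assumed setup, not the paper's implementation).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Sparse LiDAR returns projected into image coordinates (u, v) -> depth (synthetic).
rng = np.random.default_rng(0)
lidar_uv = rng.uniform(0, 100, size=(200, 2))                    # pixel coordinates
lidar_depth = 5.0 + 0.05 * lidar_uv[:, 0] + rng.normal(0, 0.1, 200)

# RBF kernel models smooth depth variation; WhiteKernel absorbs sensor noise.
kernel = 1.0 * RBF(length_scale=10.0) + WhiteKernel(noise_level=0.05)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(lidar_uv, lidar_depth)

# Query a dense pixel grid; the returned std quantifies interpolation uncertainty.
uu, vv = np.meshgrid(np.arange(0, 100, 2), np.arange(0, 100, 2))
grid = np.column_stack([uu.ravel(), vv.ravel()])
depth_mean, depth_std = gp.predict(grid, return_std=True)

dense_depth = depth_mean.reshape(uu.shape)
uncertainty = depth_std.reshape(uu.shape)
print(dense_depth.shape, uncertainty.max())
```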
Increasingly, the task of detecting and recognising the actions of a human has been delegated to some form of neural network processing camera or wearable sensor data. Because cameras are strongly affected by lighting conditions and wearable sensors provide only sparse measurements, neither modality alone can capture the data required to perform the task confidently. That being the case, range sensors such as Light Detection and Ranging (LiDAR) can complement the process to perceive the environment more robustly. Most recently, researchers have been exploring ways to apply Convolutional Neural Networks to 3-Dimensional data. These methods typically rely on a single modality and cannot draw on information from complementary sensor streams to improve accuracy. This paper proposes a framework that tackles human activity recognition by leveraging the benefits of sensor fusion and multimodal machine learning. Given both RGB and point cloud data, our method describes the activities being performed by subjects using a region-based convolutional neural network (R-CNN) and a 3-Dimensional modified Fisher vector network. Evaluation on a custom-captured multimodal dataset demonstrates that the model outputs remarkably accurate human activity classifications (90%). Furthermore, this framework can be used for sports analytics, understanding social behaviour, surveillance, and, perhaps most notably, by Autonomous Vehicles (AV) to inform data-driven decision-making policies in urban areas and indoor environments.
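One simple way to combine the two network heads is late fusion of their per-activity scores, sketched below. The activity list, the fusion weight and the dummy logits are placeholders; the paper's exact fusion strategy may differ.

```python
# Illustrative late-fusion sketch: combine per-activity scores from an image-based
# detector (e.g. an R-CNN head) with scores from a point-cloud classifier
# (e.g. a 3D modified Fisher vector network). Assumed setup, not the paper's method.
import numpy as np

ACTIVITIES = ["walking", "sitting", "waving"]   # hypothetical label set

def fuse_scores(rgb_scores: np.ndarray,
                cloud_scores: np.ndarray,
                rgb_weight: float = 0.5) -> str:
    """Weighted average of two softmax-normalised score vectors."""
    rgb = np.exp(rgb_scores) / np.exp(rgb_scores).sum()
    cloud = np.exp(cloud_scores) / np.exp(cloud_scores).sum()
    fused = rgb_weight * rgb + (1.0 - rgb_weight) * cloud
    return ACTIVITIES[int(np.argmax(fused))]

# Dummy logits standing in for the two network heads.
print(fuse_scores(np.array([2.0, 0.3, 0.1]), np.array([1.5, 0.2, 0.9])))
```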
Creating an accurate awareness of the environment using laser scanners is a major challenge in robotics and the automotive industry. LiDAR (light detection and ranging) is a powerful laser scanner that provides a detailed map of the environment. However, efficient and accurate mapping of the environment is yet to be obtained, as most modern environments contain glass, which is invisible to LiDAR. In this paper, a method to effectively detect and localise glass using LiDAR sensors is proposed. This new approach is based on the variation of range measurements between neighbouring point clouds, using a two-step filter. The first filter examines the change in the standard deviation of neighbouring clouds. The second filter uses the change in distance and intensity between neighbouring pulses to refine the results from the first filter and to estimate the glass profile width before updating the Cartesian coordinates and range measurements reported by the instrument. Test results demonstrate the detection and localisation of glass and the elimination of errors caused by glass in occupancy grid maps. This novel method detects frameless glass at long range with an accuracy of 96.2% and does not depend on an intensity peak.
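The two-step neighbourhood filter can be sketched on a single scan line as below. The window size and the thresholds (`STD_T`, `RANGE_T`, `INTENSITY_T`) are illustrative values, not the paper's calibrated parameters.

```python
# Hedged sketch of a two-step neighbourhood filter for glass candidates in one
# LiDAR scan line: step 1 flags abrupt changes in local range spread, step 2
# confirms candidates using range and intensity jumps between adjacent pulses.
import numpy as np

WINDOW = 5          # neighbouring beams considered on each side
STD_T = 0.5         # std-deviation jump flagging a candidate glass region (m)
RANGE_T = 1.0       # range discontinuity between adjacent pulses (m)
INTENSITY_T = 20.0  # intensity change between adjacent pulses (sensor units)

def detect_glass(ranges: np.ndarray, intensities: np.ndarray) -> np.ndarray:
    """Return a boolean mask marking beams likely to have hit glass."""
    n = len(ranges)
    candidate = np.zeros(n, dtype=bool)

    # Step 1: flag beams whose neighbourhood shows an abrupt change in range spread.
    for i in range(WINDOW, n - WINDOW):
        left_std = ranges[i - WINDOW:i].std()
        right_std = ranges[i:i + WINDOW].std()
        if abs(right_std - left_std) > STD_T:
            candidate[i] = True

    # Step 2: keep candidates whose adjacent pulses also show a range jump together
    # with an intensity change, which narrows down the glass profile.
    confirmed = np.zeros(n, dtype=bool)
    for i in np.flatnonzero(candidate):
        if i == 0:
            continue
        if (abs(ranges[i] - ranges[i - 1]) > RANGE_T
                and abs(intensities[i] - intensities[i - 1]) > INTENSITY_T):
            confirmed[i] = True
    return confirmed
```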
Increasingly complex automated driving functions, specifically those associated with Free Space Detection (FSD), are delegated to Convolutional Neural Networks (CNN). If the dataset used to train the network lacks diversity, modalities or sufficient quantity, the driving policy that controls the vehicle may induce safety risks. Although most autonomous ground vehicles (AGV) perform well in structured surroundings, the need for human intervention rises significantly when they are presented with unstructured niche environments. To this end, we developed an AGV for seamless indoor and outdoor navigation to collect realistic multimodal data streams. We demonstrate one application of the AGV when applied to a self-evolving FSD framework that leverages online active machine learning (ML) paradigms and sensor data fusion. In essence, the self-evolving AGV queries image data against a reliable data stream, ultrasound, before fusing the sensor data to improve robustness. We compare the proposed framework to one of the most prominent free space segmentation methods, DeepLabV3+ [1], a state-of-the-art semantic segmentation model built on a CNN-based encoder-decoder architecture. The results show that the proposed framework outperforms DeepLabV3+ [1]. The performance of the proposed framework is attributed to its ability to self-learn free space. This combination of online and active ML removes the need for the large datasets typically required by a CNN. Moreover, this technique provides case-specific free space classifications based on information gathered from the scenario at hand.
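The query-then-fuse idea can be illustrated with a minimal online active learning loop, under stated assumptions: an online linear classifier labels image patches as free space, and the ultrasound range acts as the reliable reference stream that triggers and supplies labels when the classifier is uncertain. The feature extraction, thresholds and function names are placeholders, not the paper's implementation.

```python
# Sketch of an online active learning loop that queries ultrasound when the image
# classifier is uncertain, then updates the model incrementally (assumed setup).
import numpy as np
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier(loss="log_loss")
classes = np.array([0, 1])           # 0 = obstacle, 1 = free space
UNCERTAIN = 0.6                      # query when max class probability is below this
FREE_RANGE = 1.5                     # ultrasound range (m) treated as free space
initialised = False

def ultrasound_label(distance_m: float) -> int:
    """Reliable reference label derived from the ultrasound range reading."""
    return int(distance_m > FREE_RANGE)

def process_frame(patch_features: np.ndarray, ultrasound_m: float) -> int:
    """Classify one image patch, querying the ultrasound stream when unsure."""
    global initialised
    x = patch_features.reshape(1, -1)
    label = ultrasound_label(ultrasound_m)
    if not initialised:
        clf.partial_fit(x, [label], classes=classes)
        initialised = True
        return label
    proba = clf.predict_proba(x)[0]
    if proba.max() < UNCERTAIN:
        # Active query: trust the ultrasound label and update the model online.
        clf.partial_fit(x, [label])
        return label
    return int(proba.argmax())

# Dummy usage with random patch features and an ultrasound reading of 2.3 m.
print(process_frame(np.random.rand(16), 2.3))
```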
Personal robots are set to assist humans in their daily tasks. Assisted living is one of the major applications of personal assistive robots, where the robots will support the health and wellbeing of humans in need, especially the elderly and disabled. Indoor environments are extremely challenging from a robot perception and navigation point of view because of their ever-changing decorations, internal organization and clutter. Furthermore, human-robot interaction in personal assistive robots demands intuitive and human-like intelligence and interactions. The above challenges are aggravated by stringent and often tacit requirements surrounding personal privacy, which may be invaded by continuous monitoring through sensors. Towards addressing the above problems, in this paper we present an architecture for "Ambient Intelligence" for indoor intelligent mobility by leveraging the Internet of Things (IoT) within a Scalable Multi-layered Context Mapping Framework. Our objective is to utilize sensors in home settings in the least invasive manner for the robot to learn about its dynamic surroundings and interact in a human-like manner. The paper takes a semi-survey approach to presenting and illustrating preliminary results from our in-house-built fully autonomous electric quadbike.
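A multi-layered context map can be pictured as several aligned grids, each holding one kind of context (static geometry, human presence, and so on) that IoT sensor feeds update independently and a planner queries together. The layer names and the query API in the sketch below are hypothetical, not the framework's published interface.

```python
# Hypothetical sketch of a multi-layered context map data structure: each named
# layer stores one kind of context over the same grid so they can be queried jointly.
from dataclasses import dataclass, field
from typing import Dict, Tuple
import numpy as np

@dataclass
class ContextMap:
    shape: Tuple[int, int]                          # grid resolution (rows, cols)
    layers: Dict[str, np.ndarray] = field(default_factory=dict)

    def update_layer(self, name: str, cells: np.ndarray) -> None:
        """Write a full grid for one context layer (e.g. from an IoT sensor feed)."""
        assert cells.shape == self.shape
        self.layers[name] = cells

    def query(self, row: int, col: int) -> Dict[str, float]:
        """Return every layer's value at one cell for the planner to reason over."""
        return {name: float(grid[row, col]) for name, grid in self.layers.items()}

# Dummy usage with two illustrative layers.
cmap = ContextMap(shape=(50, 50))
cmap.update_layer("occupancy", np.zeros((50, 50)))
cmap.update_layer("human_presence", np.random.rand(50, 50))
print(cmap.query(10, 12))
```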