Navigational assistance aims to help visually impaired people travel through their environment safely and independently. The task is challenging because it requires recognizing a wide variety of scenes to provide higher-level assistive awareness. Vision-based technologies built on monocular detectors or depth sensors have emerged over several years of research. These separate approaches have achieved remarkable results with relatively low processing time and have improved the mobility of visually impaired people to a large extent. However, running all detectors jointly increases latency and strains computational resources. In this paper, we propose leveraging pixel-wise semantic segmentation to cover navigation-related perception needs in a unified way. This is critical not only for terrain awareness regarding traversable areas, sidewalks, stairs and water hazards, but also for the avoidance of short-range obstacles, fast-approaching pedestrians and vehicles. The core of our unification proposal is a deep architecture designed for efficient semantic understanding. We have integrated the approach into a wearable navigation system by incorporating robust depth segmentation. A comprehensive set of experiments demonstrates accuracy competitive with state-of-the-art methods while maintaining real-time speed. We also present a closed-loop field test involving real visually impaired users, demonstrating the effectiveness and versatility of the assistive framework.
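To make the unified-perception idea concrete, the sketch below runs a generic pretrained segmentation network and turns its per-pixel labels into a traversable-area mask. The model choice (a torchvision DeepLabV3 backbone) and the class indices in TRAVERSABLE_CLASSES are illustrative assumptions; they do not reproduce the paper's own architecture or label set.

```python
# Minimal sketch of pixel-wise semantic segmentation for terrain awareness.
# Assumption: a generic pretrained DeepLabV3 model stands in for the paper's
# efficient architecture, and the class ids below are illustrative only.
import torch
import torchvision.transforms.functional as TF
from torchvision.models.segmentation import deeplabv3_mobilenet_v3_large
from PIL import Image

TRAVERSABLE_CLASSES = {0, 3}   # hypothetical label ids for "floor"/"sidewalk"

model = deeplabv3_mobilenet_v3_large(weights="DEFAULT").eval()

def traversable_mask(image_path: str) -> torch.Tensor:
    """Return a boolean mask of pixels predicted as traversable terrain."""
    img = Image.open(image_path).convert("RGB")
    x = TF.normalize(TF.to_tensor(img),
                     mean=[0.485, 0.456, 0.406],
                     std=[0.229, 0.224, 0.225]).unsqueeze(0)
    with torch.no_grad():
        logits = model(x)["out"]          # [1, C, H, W] per-pixel class scores
    labels = logits.argmax(dim=1)[0]      # [H, W] predicted class per pixel
    mask = torch.zeros_like(labels, dtype=torch.bool)
    for c in TRAVERSABLE_CLASSES:
        mask |= labels == c
    return mask
```

In a wearable pipeline, such a mask would be intersected with a depth-based ground-plane estimate before any guidance is issued.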
The use of RGB-Depth (RGB-D) sensors for assisting visually impaired people (VIP) has been widely reported, as they offer portability, function diversity and cost-effectiveness. However, without polarization cues, traversability awareness provides no precaution against stepping into water areas. In this paper, a polarized RGB-Depth (pRGB-D) framework is proposed to detect traversable areas and water hazards simultaneously from polarization-color-depth-attitude information, enhancing safety during navigation. The approach has been tested on a pRGB-D dataset built for parameter tuning and performance evaluation. Moreover, the approach has been integrated into a wearable prototype that generates stereo sound feedback to guide VIP along the prioritized direction while avoiding obstacles and water hazards. Furthermore, a preliminary study with ten blindfolded participants suggests its effectiveness and reliability.
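The polarization cue underlying such a framework can be sketched from three intensity images captured behind linear polarizers at 0°, 45° and 90°: the Stokes parameters yield the degree of linear polarization (DoLP), which is high on specular water surfaces. The threshold value and the fusion with depth and attitude used in the paper are not reproduced here; water_hazard_mask and its parameters are illustrative.

```python
# Minimal sketch of polarization-based water-hazard cueing, assuming three
# intensity images behind linear polarizers at 0°, 45° and 90°.
import numpy as np

def water_hazard_mask(i0: np.ndarray, i45: np.ndarray, i90: np.ndarray,
                      dolp_threshold: float = 0.3) -> np.ndarray:
    """Flag pixels whose degree of linear polarization exceeds a threshold."""
    i0, i45, i90 = (im.astype(np.float64) for im in (i0, i45, i90))
    s0 = i0 + i90                      # total intensity (Stokes S0)
    s1 = i0 - i90                      # Stokes S1
    s2 = 2.0 * i45 - s0                # Stokes S2
    dolp = np.sqrt(s1 ** 2 + s2 ** 2) / np.maximum(s0, 1e-6)
    return dolp > dolp_threshold       # specular water surfaces polarize strongly
```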
It is very difficult for visually impaired people to perceive and avoid obstacles at a distance. To address this problem, a unified framework for multiple-target detection, recognition and fusion is proposed, based on a sensor fusion system comprising a low-power millimeter-wave (MMW) radar and an RGB-D sensor. In this paper, Mask R-CNN and an SSD network are utilized to detect and recognize objects in the color images. Obstacle depth information is obtained from the depth images using the MeanShift algorithm. The position and velocity of multiple targets are measured by the MMW radar based on the frequency-modulated continuous-wave (FMCW) principle. Data fusion based on a particle filter obtains more accurate state estimates and richer information by fusing the detection results from the color images, depth images and radar data than any single sensor alone. The experimental results show that the data fusion enriches the detection results. Meanwhile, the effective detection range is expanded compared to using only the RGB-D sensor. Moreover, the fused results maintain high accuracy and stability under diverse range and illumination conditions. As a wearable system, the sensor fusion system offers versatility, portability and cost-effectiveness.
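As a rough illustration of the fusion step, the sketch below runs a particle filter over a single obstacle's range and radial velocity, weighting particles by camera and radar measurements. The constant-velocity motion model, the noise levels and the function names are assumptions of this sketch, not the paper's exact formulation.

```python
# Minimal sketch of particle-filter fusion of a depth-camera range estimate
# and an MMW-radar range/velocity measurement for one tracked obstacle.
import numpy as np

rng = np.random.default_rng(0)
N = 1000
particles = np.column_stack((rng.uniform(0, 10, N),    # range (m)
                             rng.uniform(-2, 2, N)))   # radial velocity (m/s)
weights = np.full(N, 1.0 / N)

def predict(dt: float = 0.1) -> None:
    """Constant-velocity motion model with small process noise (assumed)."""
    particles[:, 0] += particles[:, 1] * dt + rng.normal(0, 0.05, N)
    particles[:, 1] += rng.normal(0, 0.05, N)

def update(z_cam_range: float, z_radar_range: float, z_radar_vel: float) -> None:
    """Weight particles by the likelihood of camera and radar measurements."""
    global particles, weights
    lik = (np.exp(-0.5 * ((particles[:, 0] - z_cam_range) / 0.20) ** 2) *
           np.exp(-0.5 * ((particles[:, 0] - z_radar_range) / 0.10) ** 2) *
           np.exp(-0.5 * ((particles[:, 1] - z_radar_vel) / 0.15) ** 2))
    weights = lik / lik.sum()
    idx = rng.choice(N, N, p=weights)          # multinomial resampling
    particles = particles[idx]
    weights = np.full(N, 1.0 / N)

def estimate() -> np.ndarray:
    """Weighted mean state: [range, velocity]."""
    return np.average(particles, axis=0, weights=weights)
```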
Localization systems play an important role in assisted navigation. Precise localization makes visually impaired people aware of their surroundings and prevents them from encountering potential hazards. The majority of visual localization algorithms, which are designed for autonomous vehicles, are not fully adaptable to the scenarios of assisted navigation. These vehicle-based approaches are vulnerable to the viewpoint, appearance and route changes (between database and query images) caused by the wearable cameras of assistive devices. Facing these practical challenges, we propose Visual Localizer, composed of a ConvNet descriptor and a global optimization stage, to achieve robust visual localization for assisted navigation. The performance of five prevailing ConvNets is comprehensively compared, and GoogLeNet is found to offer the best environmental invariance. By concatenating two compressed convolutional layers of GoogLeNet, we use only thousands of bytes to represent each image efficiently. To further improve the robustness of image matching, we utilize a network-flow model for global optimization of image matching. Extensive experiments using images captured by visually impaired volunteers illustrate that the system performs well in the context of assisted navigation.
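A compact ConvNet descriptor of this kind can be sketched with torchvision's GoogLeNet: pooled activations from two intermediate layers are concatenated and truncated into a small, L2-normalized vector, and images are matched by cosine similarity. The layer choice, the crude truncation standing in for "compression", and the omission of the network-flow optimization are all assumptions of this sketch.

```python
# Minimal sketch of a compact ConvNet descriptor for place matching, assuming
# torchvision's GoogLeNet and two illustrative intermediate layers.
import torch
import torch.nn.functional as F
from torchvision.models import googlenet

model = googlenet(weights="DEFAULT").eval()
features = {}

def hook(name):
    def _hook(_module, _inputs, output):
        features[name] = output
    return _hook

# Illustrative layer choice: two inception blocks (not the paper's selection).
model.inception4a.register_forward_hook(hook("inception4a"))
model.inception5a.register_forward_hook(hook("inception5a"))

def describe(image: torch.Tensor, dim: int = 512) -> torch.Tensor:
    """Concatenate pooled activations of two layers into one small descriptor."""
    with torch.no_grad():
        model(image.unsqueeze(0))            # image: [3, H, W], ImageNet-normalized
    pooled = [F.adaptive_avg_pool2d(features[k], 4).flatten()
              for k in ("inception4a", "inception5a")]
    desc = torch.cat(pooled)[:dim]           # crude truncation as "compression"
    return F.normalize(desc, dim=0)

def match_score(desc_query: torch.Tensor, desc_db: torch.Tensor) -> float:
    """Cosine similarity between query and database descriptors."""
    return float(torch.dot(desc_query, desc_db))
```

Pairwise scores like these would then feed the global image-matching stage, for which the paper uses a network-flow formulation.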