Object recognition, which mainly comprises object detection and semantic segmentation, is one of the critical challenges for intelligent vehicles. Cameras and Lidar are the most common sensors used for object recognition, but each suffers from inherent drawbacks. Fusing camera and Lidar data is therefore a natural way to overcome the defects of either sensing modality alone. With the rise of deep learning, multi-sensor fusion methodologies have adopted deep networks as their fusion strategy and achieved impressive results on large-scale objects such as vehicles and buses. However, most existing fusion strategies discard fine-grained detail through the down-sampling operations used in deep networks, which leads to poor detection performance on small-scale objects such as pedestrians and cyclists. In this paper, we propose Enet-CRF-Lidar, a real-time multi-sensor (Lidar and color camera) fusion strategy for multi-scale object recognition at the semantic level. First, a multi-module Enet is designed to handle both large-scale and small-scale objects. Then, a CRF-RNN module is integrated with the multi-module Enet to reintroduce low-level details of the input data, yielding a significant improvement in small-scale object recognition. Experimental results show that the proposed Enet-CRF-Lidar network provides reliable detection performance on multi-scale objects and adapts to complex scenarios.
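To make the fusion idea concrete, the sketch below shows one simple way to combine a color image with a projected Lidar depth map before an ENet-style encoder-decoder. This is a hypothetical illustration under assumed channel counts and layer depths, not the authors' Enet-CRF-Lidar implementation, and it omits the CRF-RNN refinement stage.

```python
import torch
import torch.nn as nn

class EarlyFusionSegNet(nn.Module):
    """Minimal sketch: RGB + projected Lidar depth fused at the input,
    followed by a small encoder-decoder in the spirit of ENet. Channel
    counts and depths are illustrative assumptions, not the paper's."""

    def __init__(self, num_classes: int = 3):
        super().__init__()
        # 3 RGB channels + 1 Lidar depth channel = 4 fused input channels.
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 16, 3, stride=2, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
            nn.Conv2d(16, 64, 3, stride=2, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
        )
        # Decoder upsamples back to input resolution for dense per-pixel prediction.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, num_classes, 4, stride=2, padding=1),
        )

    def forward(self, rgb: torch.Tensor, lidar_depth: torch.Tensor) -> torch.Tensor:
        # rgb: (B, 3, H, W); lidar_depth: (B, 1, H, W), projected to the image plane.
        x = torch.cat([rgb, lidar_depth], dim=1)
        return self.decoder(self.encoder(x))

# Usage: per-pixel class logits at full input resolution.
logits = EarlyFusionSegNet()(torch.rand(1, 3, 256, 512), torch.rand(1, 1, 256, 512))
```

Note how the down-sampling strides in the encoder are exactly where fine detail is lost, which is the motivation the abstract gives for adding a CRF-RNN stage to reintroduce low-level information.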
LiDAR-based semantic segmentation, particularly in unstructured environments, plays a crucial role in environment perception and driving decisions for unmanned ground vehicles. Unfortunately, chaotic unstructured environments, especially the high-proportion drivable areas and large-area static obstacles within them, inevitably suffer from blurred class edges. Existing published works are prone to inaccurate edge segmentation and struggle with this challenge. To this end, this paper proposes a real-time edge-guided LiDAR semantic segmentation network for unstructured environments. First, the main branch is a lightweight architecture that extracts multi-level point cloud semantic features. Second, an edge segmentation module extracts high-resolution edge features using cascaded edge attention blocks; additional supervision ensures the accuracy of the extracted edge features and the consistency between the predicted edges and the semantic segmentation results. Third, an edge-guided fusion module fuses edge features with main-branch features in a multi-scale manner and recalibrates the channels using channel attention, so that edge information guides semantic segmentation and further improves the accuracy and adaptability of the model. Experimental results on the SemanticKITTI dataset, the Rellis-3D dataset, and our own test dataset demonstrate the effectiveness and real-time performance of the proposed network in different unstructured environments. In particular, the network achieves state-of-the-art performance in segmenting drivable areas and large-area static obstacles in unstructured environments.
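The channel recalibration step in the edge-guided fusion module can be sketched as follows. This is a minimal illustration assuming a squeeze-and-excitation style of channel attention; the actual module structure, channel widths, and reduction ratio in the paper may differ.

```python
import torch
import torch.nn as nn

class ChannelAttentionFusion(nn.Module):
    """Hypothetical sketch of edge-guided fusion: concatenate edge and
    main-branch features, then recalibrate channels with SE-style
    attention. The reduction ratio of 4 is an assumed value."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)  # global per-channel statistics
        self.excite = nn.Sequential(
            nn.Linear(2 * channels, 2 * channels // reduction), nn.ReLU(),
            nn.Linear(2 * channels // reduction, 2 * channels), nn.Sigmoid(),
        )
        self.project = nn.Conv2d(2 * channels, channels, 1)  # back to main width

    def forward(self, main_feat: torch.Tensor, edge_feat: torch.Tensor) -> torch.Tensor:
        x = torch.cat([main_feat, edge_feat], dim=1)   # (B, 2C, H, W)
        w = self.excite(self.squeeze(x).flatten(1))    # per-channel weights in (0, 1)
        x = x * w.unsqueeze(-1).unsqueeze(-1)          # recalibrate channel responses
        return self.project(x)

# Usage: fuse 64-channel semantic and edge feature maps of matching size.
fuse = ChannelAttentionFusion(64)
out = fuse(torch.rand(2, 64, 32, 64), torch.rand(2, 64, 32, 64))
```

The learned channel weights let the network emphasize edge-derived channels near class boundaries while relying on semantic channels elsewhere, which is one plausible reading of "recalibrates the channels using channel attention" above.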