2021
DOI: 10.48550/arxiv.2108.07511
Preprint

LIF-Seg: LiDAR and Camera Image Fusion for 3D LiDAR Semantic Segmentation

Lin Zhao,
Hui Zhou,
Xinge Zhu
et al.

Abstract: Camera and 3D LiDAR sensors have become indispensable devices in modern autonomous driving vehicles, where the camera provides fine-grained texture and color information in 2D space and LiDAR captures more precise and longer-range distance measurements of the surrounding environment. The complementary information from these two sensors makes two-modality fusion a desirable option. However, two major issues of camera-LiDAR fusion hinder its performance, i.e., how to effectively fuse these…

Cited by 7 publications (7 citation statements)
References 47 publications
“…Fusion has been studied for a number of LiDAR-based 3D perception tasks in a supervised and weakly-supervised manner [1,4,18,21,41,42]. For LiDAR semantic segmentation, PMF [44] and LIF-Seg [40] fuse the information from streams that process each modality individually to obtain more informative features. However, such approaches not only require image information during inference but also have linearly increasing memory and computation cost.…”
Section: Related Work
confidence: 99%
“…LIF-Seg by Zhao et al. [24] improves upon the LiDAR segmentation network Cylinder3D [25] through early- and middle-fusion with color images. Image patches around the projected points provide per-point color context for early-fusion, while mid-fusion concatenates semantic features from LiDAR and image, processed with Cylinder3D and DeepLab v3+, respectively, before processing with an additional refinement sub-network based on Cylinder3D for final semantic labels.…”
Section: Related Work
confidence: 99%
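The statement above describes a two-stage design: early-fusion of per-point color context, then mid-fusion by concatenating per-point features from the LiDAR and image backbones before a refinement sub-network. The sketch below illustrates only the mid-fusion concatenation step under stated assumptions; it is not the authors' implementation. The class name, feature dimensions, and the small MLP standing in for the Cylinder3D-based refinement module are all illustrative.

```python
# Minimal mid-fusion sketch (illustrative, not LIF-Seg's actual code):
# concatenate per-point LiDAR features (e.g. from Cylinder3D) with image
# features sampled at the corresponding pixels (e.g. from DeepLab v3+),
# then refine into per-point class logits.
import torch
import torch.nn as nn

class MidFusionHead(nn.Module):
    def __init__(self, lidar_dim: int, image_dim: int, num_classes: int):
        super().__init__()
        # Hypothetical refinement head: a small MLP standing in for the
        # Cylinder3D-based refinement sub-network described above.
        self.refine = nn.Sequential(
            nn.Linear(lidar_dim + image_dim, 128),
            nn.ReLU(inplace=True),
            nn.Linear(128, num_classes),
        )

    def forward(self, lidar_feats: torch.Tensor,
                image_feats: torch.Tensor) -> torch.Tensor:
        # lidar_feats: (N, lidar_dim), image_feats: (N, image_dim),
        # both indexed by the same N LiDAR points.
        fused = torch.cat([lidar_feats, image_feats], dim=-1)
        return self.refine(fused)  # (N, num_classes) per-point logits

# Usage on random data:
head = MidFusionHead(lidar_dim=64, image_dim=48, num_classes=20)
logits = head(torch.randn(1000, 64), torch.randn(1000, 48))
```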
“…It is aimed at establishing correspondences between instances from different modalities [3], either spatial or semantic. To achieve this, many methods use camera intrinsics to correspond the spatial positions of pixels and points, then align per pixel-point features or fuse the raw data [25,62,65,71,77,80]. Some works utilize depth information to project image features into 3D space and then fuse them with point-wise features [20,42,73,75,76].…”
Section: Image-point Cloud Cross-modal Learning
confidence: 99%
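Several of the statements above rely on pixel-point correspondence through the camera model. The sketch below shows the standard projection of LiDAR points into an image using an extrinsic transform and an intrinsic matrix; it is a minimal sketch assuming a pinhole camera, and the function name, argument conventions, and frame names are assumptions, not anything defined by the cited papers.

```python
# Minimal point-to-pixel projection sketch (assumed pinhole camera model).
import numpy as np

def project_points(points_lidar: np.ndarray, T_cam_lidar: np.ndarray,
                   K: np.ndarray, img_hw: tuple):
    """Project (N, 3) LiDAR points into pixel coordinates.

    T_cam_lidar: (4, 4) extrinsic transform from LiDAR to camera frame.
    K: (3, 3) camera intrinsic matrix.
    Returns integer (u, v) pixel coordinates and a mask of points in view.
    """
    n = points_lidar.shape[0]
    pts_h = np.hstack([points_lidar, np.ones((n, 1))])   # homogeneous coords
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]           # LiDAR -> camera frame
    uvw = (K @ pts_cam.T).T
    uv = uvw[:, :2] / np.maximum(uvw[:, 2:3], 1e-6)      # perspective divide
    h, w = img_hw
    mask = ((pts_cam[:, 2] > 0)                          # in front of camera
            & (uv[:, 0] >= 0) & (uv[:, 0] < w)
            & (uv[:, 1] >= 0) & (uv[:, 1] < h))          # inside image bounds
    return uv.astype(np.int64), mask

# Usage: gather per-point colors for early-fusion from an (H, W, 3) image.
# colors = image[uv[mask, 1], uv[mask, 0]]
```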