2016 IEEE International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra.2016.7487370

Fusing LIDAR and images for pedestrian detection using convolutional neural networks

Abstract: In this paper, we explore various aspects of fusing LIDAR and color imagery for pedestrian detection in the context of convolutional neural networks (CNNs), which have recently become state of the art for many vision problems. We incorporate LIDAR by up-sampling the point cloud to a dense depth map and then extracting three features representing different aspects of the 3D scene. We then use those features as extra image channels. Specifically, we leverage recent work on HHA [9] (horizontal disparity, height above…
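The pipeline the abstract describes starts by projecting the sparse LIDAR returns into the image plane and up-sampling them into a dense depth map. The sketch below illustrates that step; the function name, the calibration inputs (K, T), and the nearest-neighbour interpolation are illustrative assumptions, not the paper's exact up-sampling method.

```python
import numpy as np
from scipy.interpolate import griddata

def lidar_to_dense_depth(points, K, T, image_shape):
    """Project sparse LIDAR points into the image plane and up-sample
    them into a dense per-pixel depth map.

    points: (N, 3) LIDAR returns in the sensor frame.
    K: (3, 3) camera intrinsics; T: (4, 4) LIDAR-to-camera extrinsics.
    """
    # Move points into the camera frame; drop points behind the camera.
    pts_h = np.hstack([points, np.ones((len(points), 1))])
    cam = (T @ pts_h.T).T[:, :3]
    cam = cam[cam[:, 2] > 0.5]

    # Pinhole projection to pixel coordinates (u, v).
    proj = (K @ cam.T).T
    uv = proj[:, :2] / proj[:, 2:3]

    # Keep only projections that land inside the image.
    h, w = image_shape
    inside = ((uv[:, 0] >= 0) & (uv[:, 0] < w) &
              (uv[:, 1] >= 0) & (uv[:, 1] < h))
    uv, depth = uv[inside], cam[inside, 2]

    # Densify: nearest-neighbour interpolation over the full pixel grid.
    grid_u, grid_v = np.meshgrid(np.arange(w), np.arange(h))
    dense = griddata(uv, depth, (grid_u, grid_v), method="nearest")
    return dense.astype(np.float32)
```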

Cited by 117 publications (69 citation statements)
References 12 publications
“…For example, in [141], the image and the depth map went through two separate CNNs, and only the feature vectors from the last layer were concatenated to jointly carry out the final detection task. In [142], the point cloud was first converted into a three-channel HHA map (Horizontal disparity, Height above ground, and Angle). The HHA and RGB (red-green-blue) images also went through two different CNNs, but the authors found that the fusion should be done at the early-to-middle layers of the CNN instead of the last layer.…”
Section: Fusion
Mentioning confidence: 99%
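The early-versus-late fusion contrast in the quote above is easy to make concrete. Below is a minimal PyTorch sketch of a two-stream RGB+HHA backbone whose fusion point is a constructor argument; the layer widths, the toy scoring head, and the fuse_at parameter are illustrative assumptions, not the architecture of [141] or [142].

```python
import torch
import torch.nn as nn

def block(c_in, c_out):
    # One conv stage: 3x3 conv, ReLU, 2x spatial down-sampling.
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                         nn.ReLU(inplace=True),
                         nn.MaxPool2d(2))

class TwoStreamFusion(nn.Module):
    """Two-stream RGB+HHA backbone with a configurable fusion depth.

    fuse_at = 0 concatenates the raw 3-channel inputs (early fusion);
    fuse_at = 3 concatenates only the deepest features (late fusion).
    """

    def __init__(self, fuse_at: int = 1):
        super().__init__()
        widths = [3, 32, 64, 128]      # input channels + three stages
        assert 0 <= fuse_at < len(widths)
        # Parallel streams before the fusion point.
        self.rgb = nn.ModuleList(block(widths[i], widths[i + 1])
                                 for i in range(fuse_at))
        self.hha = nn.ModuleList(block(widths[i], widths[i + 1])
                                 for i in range(fuse_at))
        # Shared trunk after fusion; its first stage sees doubled channels.
        trunk = []
        for i in range(fuse_at, len(widths) - 1):
            c_in = widths[i] * (2 if i == fuse_at else 1)
            trunk.append(block(c_in, widths[i + 1]))
        self.trunk = nn.Sequential(*trunk)
        head_in = widths[-1] * (2 if fuse_at == len(widths) - 1 else 1)
        self.head = nn.Conv2d(head_in, 1, 1)   # per-location pedestrian score

    def forward(self, rgb, hha):
        for r, h in zip(self.rgb, self.hha):
            rgb, hha = r(rgb), h(hha)
        x = self.trunk(torch.cat([rgb, hha], dim=1))
        return self.head(x)
```

Sweeping fuse_at from 0 to 3 reproduces the kind of comparison the quoted survey describes: fuse_at = 3 is the late, last-layer concatenation of [141], while smaller values correspond to the early-to-middle fusion that [142] found to work better.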
“…Many works [61], [91], [99]-[101], [106], [108], [109], [111], [117]-[120], [123] deal with the 2D object detection problem on the front-view 2D image plane. Compared to 2D detection, 3D detection is more challenging since the object's distance to the ego-vehicle needs to be estimated.…”
Section: 2D or 3D Detection
Mentioning confidence: 99%
“…Eitel et al. [13] proposed to carry out object recognition by fusing depth maps and color images with a CNN. In [14], LIDAR point clouds were transformed into their HHA (horizontal disparity, height above the ground, and angle) representation [15] and then combined with RGB images using a variety of CNN fusion strategies for performing pedestrian detection. More recently, Asvadi et al. [16] developed a system for vehicle detection that integrates LIDAR and color camera data within a deep learning framework.…”
Section: Related Work
Mentioning confidence: 99%
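For reference, here is one way to build the three HHA channels from a dense depth map, in the spirit of [15]. The flat-ground assumption, the placeholder calibration values, and the normal estimation from depth gradients are simplifications, not the exact encoding used in [14] or [15].

```python
import numpy as np

def depth_to_hha(depth, camera_height=1.7, focal=721.5, baseline=0.54):
    """Encode a dense metric depth map (H, W) as three HHA-style channels.

    camera_height, focal, and baseline are illustrative placeholder
    values, not calibration constants from the paper.
    """
    # Horizontal disparity: inversely proportional to depth.
    disparity = focal * baseline / np.clip(depth, 1e-3, None)

    # Height above ground: back-project each pixel's vertical ray and
    # subtract from the camera height (flat-ground assumption).
    rows, _ = depth.shape
    v = np.arange(rows).reshape(-1, 1) - rows / 2.0
    height = camera_height - depth * v / focal

    # Angle with gravity: estimate surface normals from depth gradients,
    # then measure each normal against the (approximate) vertical axis.
    dzdx = np.gradient(depth, axis=1)
    dzdy = np.gradient(depth, axis=0)
    normals = np.dstack([-dzdx, -dzdy, np.ones_like(depth)])
    normals /= np.linalg.norm(normals, axis=2, keepdims=True)
    angle = np.degrees(np.arccos(np.clip(normals[..., 1], -1.0, 1.0)))

    # Rescale each channel to 0-255 so it can be stacked like an image.
    def scale(c):
        c = c - c.min()
        return (255 * c / (np.ptp(c) + 1e-6)).astype(np.uint8)

    return np.dstack([scale(disparity), scale(height), scale(angle)])
```

The resulting three-channel map can be fed to the HHA stream of a two-stream network such as the fusion sketch above, or concatenated with RGB as extra image channels, as the abstract describes.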