Radar-camera Fusion for Road Target Classification

Aziz, Kheir-Eddine; Greef, Eddy De; Rykunov, Maxim; Bourdoux, André; Sahli, Hichem

doi:10.1109/radarconf2043947.2020.9266510

Cited by 25 publications

(11 citation statements)

References 11 publications


“…RGB-radar fusion through deep neural network (DNN) processing has already been studied in a number of works, using different fusion strategies. Among those, the authors in [22] proposed a road target classification and tracking system using a 79-GHz FMCW radar and a standard imaging camera. They adopt a late fusion approach, applying object recognition independently to the camera data using a YOLOv3 detector [23], and to the radar data, using a CNN-LSTM network.…”

Section: Related Work (mentioning)

confidence: 99%
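
The late-fusion strategy described in the statement above can be illustrated with a minimal sketch: each sensor branch produces its own class-probability vector, and the two are combined only at the decision level. The class list, the weights, the weighted geometric mean, and the name `late_fuse` are illustrative assumptions; the statement does not say which fusion rule [22] actually uses.

```python
import numpy as np

# Hypothetical class list, chosen only for illustration.
CLASSES = ["pedestrian", "cyclist", "car"]

def late_fuse(p_camera, p_radar, w_camera=0.5, w_radar=0.5):
    """Combine per-sensor class probabilities with a weighted geometric mean.

    This is one possible decision-level rule (an assumption), not the rule
    used in the cited paper.
    """
    p_camera = np.asarray(p_camera, dtype=float)
    p_radar = np.asarray(p_radar, dtype=float)
    eps = 1e-12  # avoids log(0) when a branch assigns zero probability
    log_p = w_camera * np.log(p_camera + eps) + w_radar * np.log(p_radar + eps)
    fused = np.exp(log_p - log_p.max())  # subtract the max for numerical stability
    return fused / fused.sum()           # renormalise to a probability vector

# Camera branch favours "pedestrian", radar branch favours "cyclist".
fused = late_fuse([0.6, 0.3, 0.1], [0.2, 0.7, 0.1])
label = CLASSES[int(np.argmax(fused))]
```

Because the branches are combined only through their outputs, either one could in principle be retrained or replaced without touching the other, which is the usual argument for late fusion.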

“…Finally, the radar data must be projected from the range-azimuth view to the same perspective domain as the RGB and DVS images. As the drone altitude is not fixed, computing a homography between the radar range-azimuth plane and the cameras directly as in [22] cannot be done since the radar is not a projective sensor. This means that, as the drone altitude changes (for fixed X, Y location), the radar detection map does not change as the radar cannot distinguish the objects' elevation extents.…”

Section: A Input Pre-processing (mentioning)

confidence: 99%

“…Then it is sufficient to compute the homography matrix M using more than eight detection coordinates (x_i, y_i) and their corresponding locations on the image planes, for different drone heights. We perform this step offline as an extrinsic calibration procedure using a corner reflector as in [22]. Then, we project the intermediate point coordinates (x_i, y_i) on the image plane, and we extend their height vertically as done in [11] to better cover their spatial context, assuming a prior height of 1.5 m for all detections.…”

Section: A Input Pre-processing (mentioning)

confidence: 99%
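
The homography step in the statement above can be sketched with a plain direct linear transform (DLT): given radar-plane coordinates (x_i, y_i) and the matching image pixels, solve for the 3×3 matrix M by SVD and then project new points. The function names, the synthetic correspondences, and the matrix `M_true` are assumptions made for illustration; the statement does not say how the cited work solves for M, and the vertical extension with a 1.5 m height prior is not modelled here.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate M (3x3) such that dst ~ M @ [x, y, 1], using an unnormalised DLT.

    src: (N, 2) radar-plane coordinates, dst: (N, 2) image pixels, N >= 4.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on M.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    M = vt[-1].reshape(3, 3)
    return M / M[2, 2]  # fix the scale; assumes M[2, 2] is not close to zero

def project(M, pts):
    """Map (N, 2) points through M and divide out the homogeneous coordinate."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ M.T
    return mapped[:, :2] / mapped[:, 2:3]

# Synthetic check: nine correspondences ("more than eight") from a made-up M.
M_true = np.array([[700.0, 20.0, 320.0],
                   [10.0, -30.0, 480.0],
                   [0.01, 0.05, 1.0]])
src = np.array([[x, y] for x in (-2.0, 0.0, 2.0) for y in (5.0, 10.0, 15.0)])
dst = project(M_true, src)
M_est = estimate_homography(src, dst)
reprojected = project(M_est, src)
```

With real corner-reflector measurements the correspondences are noisy, so a normalised DLT or a robust estimator would likely be preferable to this bare version.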

“…Regarding PF4, system robustness to sensor failure and environmental conditions has been investigated by fusing standard imaging cameras with other sensing modalities such as radar [22] or Dynamic Vision Sensors [7] (DVS) (event-based cameras), which asynchronously report the change in per-pixel brightness as a train of binary pulses with ∼1 µs resolution. They provide a significantly higher dynamic range than standard cameras (∼140 dB vs. ∼60 dB) [7].…”

mentioning

confidence: 99%


Fail-Safe Human Detection for Drones Using a Multi-Modal Curriculum Learning Approach

Safa, Verbelen, Ocket, et al. 2022

IEEE Robot. Autom. Lett.

Self Cite


Drones are currently being explored for safety-critical applications where human agents are expected to evolve in their vicinity. In such applications, robust people avoidance must be provided by fusing a number of sensing modalities in order to avoid collisions. Currently however, people detection systems used on drones are solely based on standard cameras besides an emerging number of works discussing the fusion of imaging and event-based cameras. On the other hand, radar-based systems provide utmost robustness towards environmental conditions but do not provide complete information on their own and have mainly been investigated in automotive contexts, not for drones. In order to enable the fusion of radars with both event-based and standard cameras, we present KUL-UAVSAFE, a first-of-its-kind dataset for the study of safety-critical people detection by drones. In addition, we propose a baseline CNN architecture with cross-fusion highways and introduce a curriculum learning strategy for multi-modal data termed SAUL, which greatly enhances the robustness of the system towards hard RGB failures and provides a significant gain of 15% in peak F1 score compared to the use of BlackIn, previously proposed for cross-fusion networks. We demonstrate the real-time performance and feasibility of the approach by implementing the system in an edge-computing unit. We release our dataset and additional material in the project home page.



“…With insufficient accuracy in searching for vehicle objects, radar data is weak in object recognition, and it is difficult to integrate the advantages of the two types of data to improve the overall recognition performance. Aziz K et al. [33] proposed a method of using 3D-CNN+LSTM to do MIMO radar data analysis, using the YOLO algorithm to implement image object detection, and then using projection transformation to achieve result fusion. Because MIMO radar provides two-dimensional spatial data, it can implement object detection through convolutional neural networks.…”

Section: Related Work (mentioning)

confidence: 99%

A Novel Multi-Sensor Fusion Based Object Detection and Recognition Algorithm for Intelligent Assisted Driving

Liu, Liang, et al. 2021

IEEE Access


The object detection and recognition algorithm based on the fusion of millimeter-wave radar and high-definition video data can improve the safety of intelligent-driving vehicles effectively. However, due to the different data modalities of millimeter-wave radar and video, how to fuse the two effectively is the key point. The difficulty lies in the data fusion methods such as insufficient adaptability of image distortion in data alignment and coordinate transformation and also the mismatching of information levels of the data to be fused. To solve the problem of data fusion of millimeter wave radar and video, this paper proposes a decision-level fusion method of millimeter-wave radar and high-definition video data based on angular alignment. Specifically, through the joint calibration and approximate interpolation, projected to polar coordinate system, the radar and the camera are angularly aligned in the horizontal direction. Then objects are detected by a deep neural network model from video data, and combined with those detected by radar to make the joint decision. Finally, object detection and recognition task based on the fusion of the two kinds of data is completed. Theoretical analysis and experimental results indicate that the accuracy of the algorithm based on the two data fusion is superior to that of the single detection and recognition algorithm on the basis of millimeter-wave radar or video data.
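
The angular-alignment idea in the abstract above can be sketched as follows: camera detections are converted to horizontal angles, radar detections already carry an azimuth, and the two sets are paired when their angles agree. This sketch swaps in a simple pinhole model and a greedy nearest-angle matcher in place of the paper's joint calibration and approximate interpolation, which the abstract does not detail. The intrinsics `fx` and `cx`, the 2° gate, and all function names are hypothetical.

```python
import math

def pixel_to_azimuth(u, fx, cx):
    """Horizontal angle (radians) of image column u under a pinhole model."""
    return math.atan2(u - cx, fx)

def associate(radar_azimuths, box_centers_u, fx, cx,
              max_diff_rad=math.radians(2.0)):
    """Greedily pair each radar detection with the nearest unused camera box.

    Returns a list of (radar_index, box_index) pairs whose angular difference
    is within max_diff_rad. Unmatched detections are left out.
    """
    box_angles = [pixel_to_azimuth(u, fx, cx) for u in box_centers_u]
    pairs, used = [], set()
    for r_idx, az in enumerate(radar_azimuths):
        best, best_diff = None, max_diff_rad
        for b_idx, ang in enumerate(box_angles):
            diff = abs(az - ang)
            if b_idx not in used and diff <= best_diff:
                best, best_diff = b_idx, diff
        if best is not None:
            used.add(best)
            pairs.append((r_idx, best))
    return pairs

# Three radar detections and two camera boxes (made-up values).
pairs = associate(
    radar_azimuths=[math.radians(-10.0), math.radians(5.0), math.radians(30.0)],
    box_centers_u=[710.0, 464.0],
    fx=1000.0,
    cx=640.0,
)
```

Here the radar detection at 30° has no camera counterpart and stays unmatched; a joint decision rule would then have to decide whether to trust a single-sensor detection.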
