2019
DOI: 10.48550/arxiv.1902.04272
Preprint

Towards Self-Supervised High Level Sensor Fusion

Qadeer Khan, Torsten Schön, Patrick Wenzel

Abstract: In this paper, we present a framework to control a self-driving car by fusing raw information from RGB images and depth maps. A deep neural network architecture is used for mapping the vision and depth information, respectively, to steering commands. This fusion of information from two sensor sources provides redundancy and fault tolerance in the presence of sensor failures. Even if one of the input sensors fails to produce the correct output, the other functioning sensor would still be able to maneuver…
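The abstract describes two modality-specific networks whose features are combined to predict steering. Below is a minimal sketch of such a two-stream RGB/depth model; the module names, layer sizes, and concatenation-based fusion are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class TwoStreamSteeringNet(nn.Module):
    """Illustrative two-stream network: RGB and depth are encoded
    separately, fused, and regressed to a single steering command."""
    def __init__(self, feat_dim=128):
        super().__init__()
        # Small convolutional encoders for each modality (sizes are arbitrary).
        self.rgb_encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        self.depth_encoder = nn.Sequential(
            nn.Conv2d(1, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Fusion plus regression head producing one steering value.
        self.head = nn.Sequential(
            nn.Linear(2 * feat_dim, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, rgb, depth):
        f_rgb = self.rgb_encoder(rgb)
        f_depth = self.depth_encoder(depth)
        fused = torch.cat([f_rgb, f_depth], dim=1)
        return self.head(fused)

# Example usage with dummy tensors.
model = TwoStreamSteeringNet()
steer = model(torch.randn(4, 3, 88, 200), torch.randn(4, 1, 88, 200))
print(steer.shape)  # torch.Size([4, 1])
```

Because each stream has its own encoder, either input can in principle be zeroed out or replaced at inference time, which is the kind of redundancy the abstract argues for.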

Cited by 2 publications (4 citation statements)
References 4 publications
“…Once it was trained, the latent semantic vector obtained from the encoder output was fused with the depth features obtained through a separate CNN feature extractor. The fusion architecture proposed by [24] is similar to the gating mechanism driven by the learned scalar weights presented in [25]. The method proposed in [26] is closest to our work…”
Section: B. Sensor Fusion
Mentioning confidence: 90%
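The statement above likens [24]'s fusion to a gating mechanism driven by learned scalar weights. A rough, self-contained sketch of scalar-gated fusion of a semantic latent vector with depth features follows; the per-stream gate parameterization and sigmoid squashing are assumptions made for illustration, not the cited design.

```python
import torch
import torch.nn as nn

class ScalarGatedFusion(nn.Module):
    """Fuse a latent semantic vector with a depth feature vector using
    learned scalar gates (simplified interpretation of the excerpt)."""
    def __init__(self):
        super().__init__()
        # One learnable scalar weight per stream, squashed to (0, 1).
        self.w_sem = nn.Parameter(torch.zeros(1))
        self.w_depth = nn.Parameter(torch.zeros(1))

    def forward(self, z_sem, z_depth):
        g_sem = torch.sigmoid(self.w_sem)
        g_depth = torch.sigmoid(self.w_depth)
        # Weighted combination of the two feature vectors.
        return g_sem * z_sem + g_depth * z_depth

# Example usage with two 128-dimensional feature batches.
fusion = ScalarGatedFusion()
fused = fusion(torch.randn(4, 128), torch.randn(4, 128))
```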
“…Here, we also discuss some fusion strategies which were originally proposed for applications other than object detection but are relevant to our work. In [24], the authors proposed a sensor fusion methodology for RGB and depth images to steer a self-driving vehicle. A semantic segmentation network was trained using RGB images by employing an encoder-decoder architecture without skip connections…”
Section: B. Sensor Fusion
Mentioning confidence: 99%
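The excerpt describes training a semantic segmentation network on RGB images with an encoder-decoder that has no skip connections, so the bottleneck alone must summarize the scene and can later serve as the semantic latent vector for fusion. A hedged sketch of that structure is shown below; layer counts, channel widths, and the class count are made-up placeholders.

```python
import torch
import torch.nn as nn

class EncoderDecoderSeg(nn.Module):
    """Encoder-decoder segmentation network without skip connections:
    the decoder sees only the bottleneck, so the bottleneck must carry
    the full semantic summary of the image (illustrative layer sizes)."""
    def __init__(self, num_classes=13):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, num_classes, 4, stride=2, padding=1),
        )

    def forward(self, rgb):
        latent = self.encoder(rgb)     # bottleneck features, no skips kept
        logits = self.decoder(latent)  # per-pixel class scores
        return logits, latent

# Example usage: segmentation logits plus the reusable bottleneck tensor.
model = EncoderDecoderSeg()
logits, latent = model(torch.randn(2, 3, 128, 256))
```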
“…In this case, the information fusion is done by a mid-level approach; in particular, before fusion, RGB images are used to generate a semantic segmentation which corresponds to one of the information streams reaching the fusion layers, and there are two more independent streams based on LiDAR, one encoding a bird's-eye view and the other a polar grid mapping. Khan et al. [96] also used CARLA to propose an end-to-end driving CNN based on RGB and depth images, which predicts only the steering angle, assuming that neither other vehicles nor pedestrians are present. In a first step, the CNN is trained only using depth information (taken as the Z-buffer produced by UE4, the game engine behind CARLA)…”
Section: Related Work
Mentioning confidence: 99%
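The mid-level fusion described above routes three independent streams (a semantic segmentation of the RGB image, a LiDAR bird's-eye view, and a LiDAR polar grid map) into shared fusion layers before the driving head. The sketch below shows one plausible way to wire such streams together; every module name, channel width, and the concatenation-based fusion layer are assumptions, not the architecture of the cited work.

```python
import torch
import torch.nn as nn

def conv_stream(in_ch):
    """Small per-stream encoder (illustrative sizes only)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 5, stride=2), nn.ReLU(),
        nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )

class MidLevelFusionDriver(nn.Module):
    """Three independent streams (RGB semantic segmentation, LiDAR
    bird's-eye view, LiDAR polar grid map) are encoded separately and
    fused at a mid-level layer before the control head."""
    def __init__(self, num_seg_classes=13, num_controls=3):
        super().__init__()
        self.seg_stream = conv_stream(num_seg_classes)  # per-class score maps
        self.bev_stream = conv_stream(1)
        self.polar_stream = conv_stream(1)
        self.fusion = nn.Sequential(
            nn.Linear(3 * 64, 128), nn.ReLU(),
            nn.Linear(128, num_controls),  # e.g. steering, throttle, brake
        )

    def forward(self, seg, bev, polar):
        feats = torch.cat(
            [self.seg_stream(seg), self.bev_stream(bev), self.polar_stream(polar)],
            dim=1)
        return self.fusion(feats)

# Example usage with dummy inputs for the three streams.
model = MidLevelFusionDriver()
controls = model(torch.randn(2, 13, 96, 96),
                 torch.randn(2, 1, 96, 96),
                 torch.randn(2, 1, 96, 96))
```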
“…Therefore, it is worth exploring multimodality in the context of end-to-end driving models. However, the literature on this topic is still scarce [95], [96]. In fact, a survey paper on multimodal object detection and semantic segmentation, which appeared during the elaboration of this paper, states that multimodal end-to-end learning and direct perception (which refers to what we term here as multimodal end-to-end driving) is still an open question (see [97], C.2)…”
Mentioning confidence: 99%