2022
DOI: 10.48550/arxiv.2207.00186
Preprint
MMFN: Multi-Modal-Fusion-Net for End-to-End Driving

Abstract: Inspired by the fact that humans use diverse sensory organs to perceive the world, sensors with different modalities are deployed in end-to-end driving to obtain the global context of the 3D scene. In previous works, camera and LiDAR inputs are fused through transformers for better driving performance. These inputs are normally further interpreted as high-level map information to assist navigation tasks. Nevertheless, extracting useful information from the complex map input is challenging, for redundant inform…

Cited by 1 publication (3 citation statements) | References 23 publications (46 reference statements)
“…Firstly, as a sensor fusion framework, conditional imitation learning (CIL) (Codevilla et al, 2018) provides a structure for sensor data fusion policies, in which high-level commands (human intention or a planned route) are used for channel switching. This framework is widely used in most imitation learning-based models (Codevilla et al, 2018; Wang et al, 2019; Liang et al, 2018; Sobh et al, 2018; Ma et al, 2020; Huang et al, 2021; Zhang et al, 2022). Secondly, beyond the CIL structure used in most studies, the high-level command can also be treated as an additional input observation to enhance the model's steering-estimation ability (Codevilla et al, 2018; Wang et al, 2019).…”
Section: Related Work
confidence: 99%
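The command-based channel switching described in the statement above can be sketched as follows. This is a minimal illustration, not the cited papers' implementation: the branch layout, feature size, and command encoding (0=follow lane, 1=turn left, 2=turn right, 3=go straight) are assumptions for the example; real CIL models use convolutional perception backbones and learned branch MLPs.

```python
import numpy as np

def cil_forward(features, command, branch_weights):
    """Conditional imitation learning head: the high-level command
    selects which control branch processes the shared perception
    features, instead of feeding the command in as an extra input."""
    W, b = branch_weights[command]   # switch to the branch for this command
    return W @ features + b          # branch output, e.g. [steer, throttle]

rng = np.random.default_rng(0)
features = rng.standard_normal(8)    # shared features from a perception backbone
# One linear branch per high-level command (weights would be learned in practice).
branches = {c: (rng.standard_normal((2, 8)), np.zeros(2)) for c in range(4)}

steer_throttle = cil_forward(features, command=1, branch_weights=branches)
```

The alternative noted in the statement — using the command as an extra observation — would instead concatenate an encoding of the command onto `features` and use a single shared head.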
“…However, this pipeline suffers from hand-tuned parameters, complex intermediate representations, and other drawbacks. To alleviate these challenges, fully end-to-end approaches based on imitation learning or reinforcement learning have become increasingly popular in recent years (Liang et al, 2018; Gao et al, 2017; Ma et al, 2020; Zhang et al, 2022); they exploit the potential of learning technologies and can achieve performance comparable with human drivers. Various learning-based end-to-end navigation approaches are presented in the literature (Codevilla et al, 2018; Liang et al, 2018; Gao et al, 2017; Ma et al, 2020; Huang et al, 2021), which directly learn the relationship between the vehicle's raw sensor observations of the surrounding environment and its control policy, and practical demonstrations of learning-based end-to-end policies trained on real vehicles have been provided (Cui et al, 2022).…”
Section: Introduction
confidence: 99%