“…Camera-LiDAR Fusion. Cameras and LiDARs have complementary characteristics, facilitating many computer vision tasks, such as depth estimation [13,30,55], scene flow estimation [2,41], and 3D object detection [10,27,36,45,51]. Some researchers [2,36,45,55] build modular networks and perform result-level fusion, while others [13,27,30,41,51] explore feature-level fusion schemes, including early fusion and late fusion. Instead, we propose a multi-stage, bidirectional fusion pipeline, which not only fully exploits the characteristics of each modality but also maximizes inter-modality complementarity.…”