In a connected vehicle environment based on vehicle-to-vehicle (V2V) technology, images from the front vehicle and the ego vehicle are fused to augment the visual field of a driver or an autonomous system. This helps avoid road accidents by eliminating blind spots (objects occluded by other vehicles), especially when tailgating in urban areas. Multi-view image fusion is difficult when the relative location of the two sensors is unknown and the object to be fused is occluded in some views. We therefore propose a geometric image projection model and a new cooperative fusion method between neighboring vehicles. Based on a 3D inter-vehicle projection model, selected feature matching points are used to estimate the geometric transformation parameters. By adding depth information, our method also introduces a new deep-affine transformation to fuse inter-vehicle images. Experimental results on the KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) dataset validate our algorithm. Compared with previous work, our method improves the IoU index by a factor of 2 to 3. The algorithm can effectively enhance the visual perception ability of intelligent vehicles and should help advance computer vision for cooperative perception.
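As a rough illustration of the IoU (intersection over union) index used as the evaluation metric above, the following sketch computes it for two axis-aligned boxes. The function name `box_iou` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for this example; the abstract does not say whether the authors compute IoU over boxes, masks, or some other region.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max).

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Corners of the overlapping rectangle (if any).
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero so disjoint boxes give an empty intersection.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) inputs.
    return inter / union if union > 0 else 0.0
```

Under this definition, a fused object region that overlaps the ground truth more tightly yields a value closer to 1, so an improvement by a factor of 2 to 3 corresponds to a substantially larger overlap.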
In this study, we present a three-dimensional (3D) object detection algorithm for monocular images, built as an end-to-end network that incorporates depth information. The network consists of three parts. The first part is the main body, a basic object detection neural network that uses a region proposal network to obtain two-dimensional (2D) region proposals for each object. The second part is a depth estimation branch that obtains the depth of the object pixels and calculates the corresponding 3D point cloud. In the last part, the concatenated features from the first two parts are fed into fully connected layers, which output the 2D and 3D detection results. Compared with certain existing methods, the proposed approach improves detection accuracy.
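The step from per-pixel depth to a 3D point cloud can be sketched with the standard pinhole camera model. This is a minimal sketch under stated assumptions: the function name `backproject`, the nested-list depth map, and the intrinsics `fx`, `fy`, `cx`, `cy` are illustrative, and the abstract does not specify how the authors perform this conversion.

```python
def backproject(depth, fx, fy, cx, cy):
    """Convert a depth map into a list of 3D points in the camera frame.

    depth  : 2D list of depths, depth[v][u] for pixel column u, row v
    fx, fy : focal lengths in pixels (assumed known from calibration)
    cx, cy : principal point in pixels (assumed known from calibration)
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                # Skip pixels without a valid depth estimate.
                continue
            # Invert the pinhole projection u = fx * x / z + cx.
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points
```

In a detection pipeline of the kind described, only the pixels inside a 2D region proposal would typically be passed to such a function, so that the resulting points belong to a single candidate object.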