Currently, single-stage point-based 3D object detection networks remain underexplored. Many approaches operate directly on the point cloud space without optimization and fail to capture the relationships among neighboring point sets. In this paper, we propose DCGNN, a novel single-stage 3D object detection network based on density clustering and graph neural networks. DCGNN utilizes a density clustering ball query to partition the point cloud space and exploits local and global relationships with graph neural networks. The density clustering ball query optimizes the point cloud space partitioned by the original ball query approach, ensuring that the key point sets contain more detailed object features. Graph neural networks are well suited to exploiting relationships among points and point sets. Additionally, as a single-stage 3D object detection network, DCGNN achieves fast inference. We evaluate DCGNN on the KITTI dataset. Compared with state-of-the-art approaches, the proposed DCGNN achieves a better balance between detection performance and inference time.
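To make the grouping step concrete, below is a minimal NumPy sketch of the standard ball query that DCGNN refines. The paper's density-clustering variant is not specified here, so the simple neighbor-count heuristic for selecting dense centers (`density_based_centers`) is an illustrative assumption, not the authors' algorithm.

```python
import numpy as np

def ball_query(points, centers, radius, max_samples):
    """Standard ball query: for each center, gather up to `max_samples`
    point indices whose distance to the center is within `radius`."""
    groups = []
    for c in centers:
        d = np.linalg.norm(points - c, axis=1)
        idx = np.nonzero(d <= radius)[0][:max_samples]
        groups.append(idx)
    return groups

def density_based_centers(points, radius, k):
    """Pick the k points with the most neighbors within `radius` as centers.
    This is a simple density heuristic; the paper's clustering may differ."""
    counts = np.array([np.sum(np.linalg.norm(points - p, axis=1) <= radius)
                       for p in points])
    return points[np.argsort(-counts)[:k]]

rng = np.random.default_rng(0)
pts = rng.random((200, 3))                      # toy point cloud
centers = density_based_centers(pts, radius=0.2, k=4)
groups = ball_query(pts, centers, radius=0.2, max_samples=32)
```

Selecting centers in dense regions makes the grouped point sets concentrate on object surfaces rather than empty space, which is the intuition behind replacing the original ball query's partition.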
Inferring objects with reasonable shape and appearance from a single picture is a challenging problem. Existing research often pays more attention to the structure of the point cloud generation network while ignoring the extraction of 2D image features and the reduction of feature loss during propagation through the network. In this paper, a single-stage, single-view 3D point cloud reconstruction network, 3D-SSRecNet, is proposed. The proposed 3D-SSRecNet is a simple single-stage network composed of a 2D image feature extraction network and a point cloud prediction network. The single-stage structure reduces the loss of the extracted 2D image features. The 2D image feature extraction network takes DetNet as its backbone, which can extract more details from 2D images. To generate point clouds with better shape and appearance, the point cloud prediction network uses the exponential linear unit (ELU) as its activation function, and the joint function of chamfer distance (CD) and Earth mover's distance (EMD) is used as the loss function of 3D-SSRecNet. To verify the effectiveness of 3D-SSRecNet, we conducted a series of experiments on the ShapeNet and Pix3D datasets. The experimental results, measured by CD and EMD, show that 3D-SSRecNet outperforms state-of-the-art reconstruction methods.
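The two distances combined in the loss above are standard and can be sketched compactly. The following NumPy/SciPy version computes CD and an exact assignment-based EMD on small point sets; the equal weighting in `joint_loss` is an assumption for illustration, as the abstract does not state how the two terms are combined.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def chamfer_distance(p, q):
    """Symmetric Chamfer distance between point sets p (N,3) and q (M,3)."""
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)  # pairwise dists
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def earth_movers_distance(p, q):
    """EMD via optimal one-to-one matching (assumes equal-size point sets)."""
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)
    row, col = linear_sum_assignment(d)
    return d[row, col].mean()

def joint_loss(pred, gt, w_cd=1.0, w_emd=1.0):
    # The weights w_cd / w_emd are illustrative; the paper's values may differ.
    return w_cd * chamfer_distance(pred, gt) + w_emd * earth_movers_distance(pred, gt)
```

CD only matches each point to its nearest neighbor, so it tolerates uneven point densities; EMD enforces a one-to-one correspondence and so penalizes density mismatches, which is why combining them tends to improve both shape and appearance.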
Traffic sign detection is a key part of intelligent assisted driving but also a challenging task due to the small size of signs and the differing scales of objects in the foreground and at close range. In this paper, we propose a new traffic sign detection scheme: Spatial Pyramid Pooling and Adaptively Spatial Feature Fusion based Yolov3 (SPP and ASFF-Yolov3). To integrate target detail features and environmental context features in the feature extraction stage of the Yolov3 network, the Spatial Pyramid Pooling module is introduced into its pyramid network. Additionally, the Adaptively Spatial Feature Fusion module is added to the target detection phase of the pyramid network to avoid interference among features of different scales during gradient calculation. Experimental results show the effectiveness of the proposed SPP and ASFF-Yolov3 network, which achieves better detection results than the original Yolov3 network. It achieves real-time inference speed, although slower than the original Yolov3 network. The proposed scheme adds an option to the solutions for traffic sign detection that combine real-time inference speed with effective detection results.
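The SPP module mentioned above can be sketched in a few lines: it concatenates a feature map with stride-1 max-poolings at several kernel sizes, fusing local detail with wider context at the same spatial resolution. The kernel sizes (5, 9, 13) follow the common Yolov3-SPP configuration and are an assumption here; this NumPy version illustrates the idea, not the paper's exact implementation.

```python
import numpy as np

def max_pool_same(x, k):
    """Stride-1 max pooling with 'same' padding on a (C, H, W) feature map."""
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)), constant_values=-np.inf)
    C, H, W = x.shape
    out = np.empty_like(x)
    for i in range(H):
        for j in range(W):
            out[:, i, j] = xp[:, i:i + k, j:j + k].max(axis=(1, 2))
    return out

def spp(x, kernels=(5, 9, 13)):
    """SPP block: concatenate the input with its multi-scale max-poolings
    along the channel axis; spatial size is preserved."""
    return np.concatenate([x] + [max_pool_same(x, k) for k in kernels], axis=0)

rng = np.random.default_rng(0)
feat = rng.random((2, 16, 16))   # toy (C, H, W) feature map
fused = spp(feat)                # shape (2 * 4, 16, 16)
```

Because each pooled copy keeps the original resolution, the output simply has 4x the channels, which the subsequent convolution can then compress while retaining multi-receptive-field information.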