In this paper we address the problem of extracting vehicle 3D pose from 2D RGB images. We present an accurate methodology capable of locating the 3D coordinates of 20 pre-defined semantic vehicle points of interest, or keypoints, from 2D information. The proposed two-step pipeline provides a straightforward way of extracting three-dimensional information from planar images while avoiding the use of additional sensors that would lead to a more expensive and harder-to-manage system. The main contribution of this work is a pair of dedicated network architectures that simultaneously locate occluded and visible semantic points of interest and convert these 2D points into 3D space in a simple but efficient way. The first stage uses a robust network based on the Stacked Hourglass architecture for precise prediction of semantic 2D vehicle keypoints, even when they are occluded. In the second stage, another dedicated network converts the 2D points into 3D world coordinates, so the 3D pose of the vehicle is extracted automatically, outperforming state-of-the-art techniques in terms of accuracy.
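As an illustration of the second stage, the sketch below shows a minimal 2D-to-3D lifting network that maps the 20 predicted 2D keypoints to 3D coordinates. The plain MLP structure, layer sizes, and PyTorch framework are assumptions made for illustration, not the authors' exact architecture.

```python
# Minimal sketch (assumed, not the paper's exact network): lifting the 20
# predicted 2D vehicle keypoints to 3D world coordinates.
import torch
import torch.nn as nn

NUM_KEYPOINTS = 20  # semantic vehicle points of interest, as in the paper

class LiftingNet(nn.Module):
    """Maps flattened 2D keypoints (20 x 2) to 3D coordinates (20 x 3)."""
    def __init__(self, hidden=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NUM_KEYPOINTS * 2, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, NUM_KEYPOINTS * 3),
        )

    def forward(self, kp2d):
        # kp2d: (batch, 20, 2) pixel coordinates from the hourglass stage
        out = self.net(kp2d.flatten(1))
        return out.view(-1, NUM_KEYPOINTS, 3)

# Usage: 2D keypoints from the first-stage network are lifted to 3D.
kp2d = torch.rand(4, NUM_KEYPOINTS, 2)   # hypothetical batch of detections
kp3d = LiftingNet()(kp2d)                # (4, 20, 3) 3D coordinates
```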
This paper presents a deep neural network architecture designed to run on a field-programmable gate array (FPGA) for vehicle detection on LIDAR point clouds. The work presents a network based on VoxelNet, adapted to run on an FPGA and to locate vehicles in point clouds from 32- and 64-channel optical sensors. The KITTI and nuScenes datasets were used to train the presented network. This work aims to motivate the use of dedicated FPGA targets for training and validating neural networks, given their accelerated computational capability compared to well-known GPUs. The platform also has constraints, such as limited memory, that need to be assessed and handled during development. This research presents an implementation that overcomes such limitations and obtains results as good as those achieved with a GPU. The paper makes use of a state-of-the-art dataset, nuScenes, which is composed of 6 cameras, 5 radars and 1 LIDAR, all with a full 360-degree field of view, and provides seven times more annotations than the KITTI dataset. The presented work demonstrates real-time performance and good detection accuracy when moving part of the CNN in the proposed architecture to a commercial FPGA.
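To make the data path concrete, the following sketch voxelizes a LIDAR point cloud in the VoxelNet style, the stage that feeds the detection CNN. The grid extents, voxel size, points-per-voxel cap, and NumPy implementation are illustrative assumptions, not the paper's FPGA code; a fixed cap on points per voxel is the kind of bound that simplifies buffering on memory-constrained hardware.

```python
# Minimal sketch (assumed, not the paper's FPGA implementation): VoxelNet-style
# voxelization of a LIDAR point cloud before the convolutional detector.
import numpy as np

def voxelize(points, voxel_size=(0.2, 0.2, 0.4),
             pc_range=(0.0, -40.0, -3.0, 70.4, 40.0, 1.0),
             max_points_per_voxel=35):
    """Group (x, y, z, intensity) points into fixed-size voxels."""
    # Keep only points inside the detection range.
    points = points[
        (points[:, 0] >= pc_range[0]) & (points[:, 0] < pc_range[3]) &
        (points[:, 1] >= pc_range[1]) & (points[:, 1] < pc_range[4]) &
        (points[:, 2] >= pc_range[2]) & (points[:, 2] < pc_range[5])
    ]
    # Integer voxel index of every remaining point.
    coords = ((points[:, :3] - np.array(pc_range[:3])) /
              np.array(voxel_size)).astype(np.int32)
    voxels = {}
    for p, c in zip(points, map(tuple, coords)):
        bucket = voxels.setdefault(c, [])
        if len(bucket) < max_points_per_voxel:  # fixed cap eases on-chip buffering
            bucket.append(p)
    return voxels  # voxel index -> list of points, input to the feature encoder

# Usage with a random cloud standing in for a KITTI / nuScenes sweep.
cloud = (np.random.rand(1000, 4) * np.array([70.0, 80.0, 4.0, 1.0])
         + np.array([0.0, -40.0, -3.0, 0.0]))
print(len(voxelize(cloud)), "occupied voxels")
```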
We describe a novel approach to algorithm concretization that extends the current mode of software visualization from computer screens to the real world. The method combines hands-on robotics and traditional algorithm visualization techniques to help diverse learners comprehend the basic idea of a given algorithm. From this point of view, the robots interpret an algorithm, while their internal program and external appearance determine the role they play in it. This makes it possible to bring algorithms into the real physical world, where students can even touch the data structures during execution. In the first version, we have concentrated on a few sorting algorithms as a proof of concept. Moreover, we have carried out an evaluation with 13-to-15-year-old students who used the concretization to gain insight into one sorting algorithm. The preliminary results indicate that the tool can enhance learning. Our aim now is to build an environment that supports both visualizations and robotics-based concretizations of algorithms at the same time.
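As a rough illustration of how robots could act out an algorithm step by step, the sketch below runs a sorting algorithm that yields each comparison and swap as an event a robot controller could react to; this driving mechanism is an assumption for illustration, not the authors' actual robot software.

```python
# Minimal sketch (assumed, not the authors' system): a sorting algorithm that
# emits its steps as events, so robots holding the values could act them out.
def bubble_sort_events(values):
    """Yield ('compare', i, j) and ('swap', i, j) events while sorting a copy."""
    data = list(values)
    n = len(data)
    for end in range(n - 1, 0, -1):
        for i in range(end):
            yield ("compare", i, i + 1)
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                yield ("swap", i, i + 1)

# Each event could be translated into robot motions (e.g. two robots trading places).
for event in bubble_sort_events([5, 2, 9, 1]):
    print(event)
```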