We have presented a Spiking Convolutional Neural Network (SCNN) that incorporates retinal foveal-pit inspired Difference of Gaussian filters and rank-order encoding. The model is trained using a variant of the backpropagation algorithm adapted to work with spiking neurons, as implemented in the Nengo library. We have evaluated the performance of our model on two publicly available datasets -one for digit recognition task, and the other for vehicle recognition task. The network has achieved up to 90% accuracy, where loss is calculated using the cross-entropy function. This is an improvement over around 57% accuracy obtained with the alternate approach of performing the classification without any kind of neural filtering. Overall, our proof-of-concept study indicates that introducing biologically plausible filtering in existing SCNN architecture will work well with noisy input images such as those in our vehicle recognition task. Based on our results, we plan to enhance our SCNN by integrating lateral inhibition-based redundancy reduction prior to rank-ordering, which will further improve the classification accuracy by the network.
In-the-wild human pose estimation has a huge potential for various fields, ranging from animation and action recognition to intention recognition and prediction for autonomous driving. The current state-of-the-art is focused only on RGB and RGB-D approaches for predicting the 3D human pose. However, not using precise LiDAR depth information limits the performance and leads to very inaccurate absolute pose estimation. With LiDAR sensors becoming more affordable and common on robots and autonomous vehicle setups, we propose an end-to-end architecture using RGB and LiDAR to predict the absolute 3D human pose with unprecedented precision. Additionally, we introduce a weakly-supervised approach to generate 3D predictions using 2D pose annotations from PedX [1]. This allows for many new opportunities in the field of 3D human pose estimation.
Hyperspectral satellite imagery is now widely being used for accurate disaster prediction and terrain feature classification. However, in such classification tasks, most of the present approaches use only the spectral information contained in the images. Therefore, in this paper, we present a novel framework that takes into account both the spectral and spatial information contained in the data for land cover classification. For this purpose, we use the Gaussian Maximum Likelihood (GML) and Convolutional Neural Network (CNN) methods for the pixel-wise spectral classification and then, using segmentation maps generated by the Watershed algorithm, we incorporate the spatial contextual information into our model with a modified majority vote technique. The experimental analyses on two benchmark datasets demonstrate that our proposed methodology performs better than the earlier approaches by achieving an accuracy of 99.52% and 98.31% on the Pavia University and the Indian Pines datasets respectively. Additionally, our GML based approach, a non-deep learning algorithm, shows comparable performance to the stateof-the-art deep learning techniques, which indicates the importance of the proposed approach for performing a computationally efficient classification of hyperspectral imagery.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.