Human action recognition is one of the key tasks in video understanding. Deep Convolutional Neural Networks (CNNs) are often used for this purpose. Although they usually perform impressively, interpreting their decisions remains challenging. We propose a novel technique for visually understanding CNN features. Its objective is to find the salient features that played a key role in the network's decision. The technique uses only the features from the last convolutional layer before the fully connected layers of a trained model and builds an importance map of those features. The map is propagated back to the original frames, highlighting the regions in them that contribute to the final decision. The method is fast because it does not require gradient computation, unlike many state-of-the-art methods. The proposed technique is applied to the Twin Spatio-Temporal 3D Convolutional Neural Network (TSTCNN), designed for table tennis action recognition. Feature visualization is performed on the RGB and optical flow branches of the network. The results are compared with other visualization techniques in terms of both human understanding and similarity metrics. The metrics show that the generated maps are similar to those obtained with the well-known Grad-CAM method: the Pearson correlation coefficient between the Grad-CAM maps and the maps from our method is 0.7 ± 0.05 on RGB data and 0.72 ± 0.06 on optical flow data.
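The similarity metric reported above can be illustrated with a minimal sketch of the Pearson correlation coefficient between two saliency maps. The function name `pearson_cc` and the toy maps are illustrative assumptions, not the authors' code; the abstract does not say how the maps were normalized or resized before comparison, so the sketch simply assumes two maps of identical shape.

```python
import numpy as np


def pearson_cc(map_a, map_b):
    """Pearson correlation coefficient between two same-shaped maps.

    Hypothetical helper: flattens both maps, centers them, and returns
    the normalized dot product. Returns NaN if either map is constant.
    """
    a = np.asarray(map_a, dtype=np.float64).ravel()
    b = np.asarray(map_b, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError("maps must have the same number of elements")
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    if denom == 0.0:
        return float("nan")
    return float((a * b).sum() / denom)


# Toy example (made-up values, not data from the paper).
reference_map = np.array([[0.0, 0.2], [0.6, 1.0]])
candidate_map = 2.0 * reference_map + 1.0  # a positive linear rescaling
score = pearson_cc(reference_map, candidate_map)
```

Because the coefficient is invariant to positive linear rescaling, two maps that highlight the same regions with different dynamic ranges still score close to 1.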
Edge detection is a method for detecting the boundaries of objects in an image, typically identified by sharp changes in pixel intensity. We implemented the Canny edge detection algorithm, widely regarded as an optimal edge detector, in FPGA hardware using hardware-software co-simulation with Simulink (MathWorks) and System Generator (Xilinx). In addition to Sobel, the operator typically used for gradient calculation (the primary edge detection step), we explored several other edge detection operators. A comparative analysis found the Sobel and Roberts operators to be among the best, with the hardware realization of the Roberts operator using fewer resources (LUTs and flip-flops). All versions of the algorithm were synthesized for Spartan-6 LX16 FPGAs from Xilinx.
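The gradient calculation step compared above can be sketched in software as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' FPGA design: the kernels are the standard Sobel (3×3) and Roberts cross (2×2) masks, and the helper names `correlate_valid` and `gradient_magnitude` are hypothetical.

```python
import numpy as np

# Standard Sobel masks (3x3) and Roberts cross masks (2x2).
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T
ROBERTS_X = np.array([[1, 0], [0, -1]], dtype=float)
ROBERTS_Y = np.array([[0, 1], [-1, 0]], dtype=float)


def correlate_valid(image, kernel):
    """Slide the kernel over the image without padding ('valid' region)."""
    image = np.asarray(image, dtype=float)
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    # Accumulate one shifted, weighted copy of the image per kernel tap.
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * image[i:i + out_h, j:j + out_w]
    return out


def gradient_magnitude(image, kernel_x, kernel_y):
    """Euclidean gradient magnitude from two directional responses."""
    gx = correlate_valid(image, kernel_x)
    gy = correlate_valid(image, kernel_y)
    return np.hypot(gx, gy)


# Toy image: a vertical step edge between columns 2 and 3.
step = np.zeros((6, 6))
step[:, 3:] = 1.0

sobel_mag = gradient_magnitude(step, SOBEL_X, SOBEL_Y)
roberts_mag = gradient_magnitude(step, ROBERTS_X, ROBERTS_Y)
```

The Roberts masks have four taps each against nine for Sobel, which is consistent with the lower resource usage reported for the Roberts realization, although the actual hardware cost depends on the design.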