This paper presents a new method for 3D action recognition with skeleton
sequences (i.e., 3D trajectories of human skeleton joints). The proposed method
first transforms each skeleton sequence into three clips, each consisting of
several frames, for spatial-temporal feature learning using deep neural
networks. Each clip is generated from one channel of the cylindrical
coordinates of the skeleton sequence. Each frame of the generated clips
represents the temporal information of the entire skeleton sequence, and
incorporates one particular spatial relationship between the joints. Together,
the clips include multiple frames with different spatial relationships, which
provide useful spatial structural information about the human skeleton. We propose
to use deep convolutional neural networks to learn long-term temporal
information of the skeleton sequence from the frames of the generated clips,
and then use a Multi-Task Learning Network (MTLN) to jointly process all frames
of the generated clips in parallel to incorporate spatial structural
information for action recognition. Experimental results clearly show the
effectiveness of the proposed new representation and feature learning method
for 3D action recognition.
Comment: CVPR 2017
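To make the clip construction concrete, here is a minimal NumPy sketch, assuming a sequence of T frames with J joints given in Cartesian coordinates; the helper names (`to_cylindrical`, `make_clips`), the reference-joint indices, and the relative-position encoding are illustrative assumptions rather than the authors' exact implementation.

```python
import numpy as np

def to_cylindrical(seq_xyz):
    """Convert a skeleton sequence (T, J, 3) from Cartesian (x, y, z)
    to cylindrical (rho, theta, z) coordinates."""
    x, y, z = seq_xyz[..., 0], seq_xyz[..., 1], seq_xyz[..., 2]
    rho = np.sqrt(x ** 2 + y ** 2)
    theta = np.arctan2(y, x)
    return np.stack([rho, theta, z], axis=-1)             # (T, J, 3)

def make_clips(seq_xyz, ref_joints=(0, 4, 8, 12)):
    """Build one clip per cylindrical channel. Each frame of a clip is a
    (J, T) image of joint values taken relative to one reference joint,
    so a single frame already spans the whole temporal extent of the
    sequence. The reference joints chosen here are purely illustrative."""
    seq_cyl = to_cylindrical(seq_xyz)                     # (T, J, 3)
    clips = []
    for c in range(3):                                    # one clip per channel
        channel = seq_cyl[..., c]                         # (T, J)
        frames = [(channel - channel[:, [r]]).T           # relative to joint r -> (J, T)
                  for r in ref_joints]
        clips.append(np.stack(frames))                    # (n_refs, J, T)
    return clips

# Toy usage: 40 frames, 20 joints
clips = make_clips(np.random.randn(40, 20, 3))
print([c.shape for c in clips])                           # [(4, 20, 40)] * 3
```

Each resulting frame can then be treated as an image, so a convolutional network sees the whole temporal extent of the sequence at once, in line with the long-term temporal learning described above.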
This paper presents a new representation of skeleton sequences for 3D action recognition. Existing methods based on hand-crafted features or recurrent neural networks cannot adequately capture the complex spatial structures and the long-term temporal dynamics of skeleton sequences, which are very important for recognizing actions. In this paper, we propose to transform each channel of the 3D coordinates of a skeleton sequence into a clip. Each frame of the generated clip represents the temporal information of the entire skeleton sequence and one particular spatial relationship between the skeleton joints. The entire clip incorporates multiple frames with different spatial relationships, which provide useful spatial structural information about the human skeleton. We also propose a multi-task convolutional neural network (MTCNN) to learn from the generated clips for action recognition. The proposed MTCNN processes all the frames of the generated clips in parallel to explore the spatial and temporal information of the skeleton sequences. The proposed method has been extensively tested on six challenging benchmark datasets. Experimental results consistently demonstrate the superiority of the proposed clip representation and feature learning method for 3D action recognition compared to existing techniques.
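A rough sketch of how such a multi-task network could process all clip frames in parallel is given below in PyTorch; the layer sizes, the one-classification-head-per-frame task layout, and the score-averaging fusion are assumptions for illustration, not the published MTCNN architecture.

```python
import torch
import torch.nn as nn

class MultiTaskClipNet(nn.Module):
    """Sketch of a multi-task CNN over clip frames: a shared convolutional
    backbone extracts features from every frame, and each frame position
    gets its own classification head (one 'task' per frame). All layer
    sizes here are illustrative assumptions."""

    def __init__(self, n_frames, n_classes, in_ch=3):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.heads = nn.ModuleList(
            [nn.Linear(64, n_classes) for _ in range(n_frames)]
        )

    def forward(self, frames):
        # frames: (batch, n_frames, channels, height, width)
        scores = [head(self.backbone(frames[:, i]))
                  for i, head in enumerate(self.heads)]
        # fuse the per-task scores (simple averaging here) into one decision
        return torch.stack(scores).mean(dim=0)

# Toy usage: batch of 2 samples, 4 frames per clip, 60 action classes
logits = MultiTaskClipNet(n_frames=4, n_classes=60)(torch.randn(2, 4, 3, 20, 40))
print(logits.shape)                                       # torch.Size([2, 60])
```

In a sketch like this, every head is trained with the same action label, which is one simple way to realize the joint, parallel processing of all frames described in the abstract.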
A highly efficient P-SSHI based rectifier for piezoelectric energy harvesting is presented in this paper. The proposed rectifier utilizes the voltages at the two ends of the piezoelectric device (PD) to detect the polarity change of the current produced by the PD. The inversion process of the voltage across the PD is automatically controlled by diodes along the oscillating network. In contrast to prior works, the proposed rectifier exhibits several advantages in terms of efficiency, circuit simplicity, compatibility with commercially available PDs, and standalone operation. Experimental results show that the proposed rectifier can provide a 5.8X boost in harvested energy compared to the conventional full wave bridge rectifier.
This letter presents SkeletonNet, a deep learning framework for skeleton-based 3D action recognition. Given a skeleton sequence, the spatial structure of the skeleton joints in each frame and the temporal information across multiple frames are two important factors for action recognition. We first extract body-part-based features from each frame of the skeleton sequence. Compared to the original coordinates of the skeleton joints, the proposed features are translation, rotation and scale invariant. To learn robust temporal information, instead of treating the features of all frames as a time series, we transform the features into images and feed them to the proposed deep learning network, which contains two parts: one extracts general features from the input images, while the other generates a discriminative and compact representation for action recognition. The proposed method is tested on the SBU Kinect Interaction dataset, the CMU dataset and the large-scale NTU RGB+D dataset, and achieves state-of-the-art performance.
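As a toy illustration of translation-, rotation- and scale-invariant body-part features, the sketch below computes bone vectors (which discard translation), relative bone lengths (which discard scale) and pairwise cosines between bone directions (which discard rotation); the bone list and the descriptor itself are assumptions for illustration, not SkeletonNet's actual feature definition.

```python
import numpy as np

def invariant_part_features(joints, bones):
    """Toy per-frame descriptor that is translation, rotation and scale
    invariant: bone vectors (joint differences) remove translation,
    dividing by the mean bone length removes scale, and cosines between
    bone directions remove rotation. Purely illustrative."""
    vecs = np.array([joints[j] - joints[i] for i, j in bones])    # (B, 3)
    lengths = np.linalg.norm(vecs, axis=1)
    rel_lengths = lengths / lengths.mean()                         # scale-invariant
    unit = vecs / lengths[:, None]                                 # bone directions
    cosines = unit @ unit.T                                        # rotation-invariant
    upper = cosines[np.triu_indices(len(bones), k=1)]
    return np.concatenate([rel_lengths, upper])

# Toy usage: 20 joints, a few illustrative (parent, child) bone index pairs
bones = [(0, 1), (1, 2), (2, 3), (0, 4), (4, 5)]
feat = invariant_part_features(np.random.randn(20, 3), bones)
print(feat.shape)   # (15,) = 5 relative lengths + 10 pairwise cosines
```

Per-frame descriptors of this kind can then be stacked over time into an image, matching the letter's idea of turning temporal feature sequences into image inputs for the network.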