Internal Transfer Learning for Improving Performance in Human Action Recognition for Small Datasets

Wang, Tian; Chen, Yang; Zhang, Mingjie; Chen, Jie; Snoussi, Hichem

doi:10.1109/access.2017.2746095

Cited by 50 publications

(28 citation statements)

References 29 publications

“…In addition, the fusion at the decision level is superior to fusion at the feature level, and the performance outperforms most of the compared methods. (3) The method in [28] obtained 96.1% accuracy, which outperforms our method because of the effective internal transfer learning (ITL) strategy proposed in [28]. In addition, the performance in [30] is also better than our method since it employs a temporal network like long short-term memory (LSTM).…”

Section: Comparison With Other Methods (mentioning)

confidence: 81%

“…Table 3 gives the compared results of these different methods. From Table 3, the following conclusions can be made: (1) In view of the performance comparison of a single deep model, the proposed 3D DenseNet model named D40-3D-DenseNet achieves the best performance compared with other models such as 3D CNN (LRN + HRN) [27], C3D [28], md3D CNN [29], 3D CNN [30], and Faster R-CNN [31], which means the proposed 3D DenseNet model can better represent the spatiotemporal motion patterns than other models. The reason is that it makes the best use of features through dense connections.…”

Section: Comparison With Other Methods (mentioning)

confidence: 94%

“…The proposed method is first compared with two existing methods on VIVA dataset. Note that the first compared method in [24] relies on handcrafted feature representations for gesture classification, while the other compared method in [27][28][29][30][31] uses 3D CNN to extract gesture features. Table 3 gives the compared results of these different methods.…”

Section: Comparison With Other Methods (mentioning)

confidence: 99%

Fusion of 2D CNN and 3D DenseNet for Dynamic Gesture Recognition

et al. 2019

Gesture recognition has been applied in many fields as it is a natural human–computer communication method. However, recognition of dynamic gesture is still a challenging topic because of complex disturbance information and motion information. In this paper, we propose an effective dynamic gesture recognition method by fusing the prediction results of a two-dimensional (2D) motion representation convolution neural network (CNN) model and three-dimensional (3D) dense convolutional network (DenseNet) model. Firstly, to obtain a compact and discriminative gesture motion representation, the motion history image (MHI) and pseudo-coloring technique were employed to integrate the spatiotemporal motion sequences into a frame image, before being fed into a 2D CNN model for gesture classification. Next, the proposed 3D DenseNet model was used to extract spatiotemporal features directly from Red, Green, Blue (RGB) gesture videos. Finally, the prediction results of the proposed 2D and 3D deep models were blended together to boost recognition performance. The experimental results on two public datasets demonstrate the effectiveness of our proposed method.
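The pipeline in the abstract above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `motion_history_image` and `fuse_predictions`, the frame-difference threshold, the linear decay, and the weighted-average blending rule are all assumptions chosen for illustration, since the abstract does not specify them. The pseudo-coloring step and both deep models are omitted.

```python
import numpy as np


def motion_history_image(frames, tau=None, diff_threshold=30):
    """Collapse a grayscale clip of shape (T, H, W) into one motion history image.

    Assumed scheme: pixels that changed between consecutive frames are set to
    tau, all other pixels decay by one per step. Output is scaled to [0, 1].
    """
    frames = np.asarray(frames, dtype=np.int16)  # avoid uint8 wrap-around
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    tau = (len(frames) - 1) if tau is None else tau
    mhi = np.zeros(frames.shape[1:], dtype=np.float32)
    for t in range(1, len(frames)):
        moving = np.abs(frames[t] - frames[t - 1]) > diff_threshold
        # Recent motion gets the highest value; older motion fades out.
        mhi = np.where(moving, float(tau), np.maximum(mhi - 1.0, 0.0))
    return mhi / tau


def fuse_predictions(probs_2d, probs_3d, weight_2d=0.5):
    """Decision-level fusion: weighted average of two class-probability vectors.

    The weighting rule is an assumption; the abstract only says the
    predictions of the 2D and 3D models are blended.
    """
    probs_2d = np.asarray(probs_2d, dtype=np.float64)
    probs_3d = np.asarray(probs_3d, dtype=np.float64)
    fused = weight_2d * probs_2d + (1.0 - weight_2d) * probs_3d
    return fused, int(np.argmax(fused))
```

In this sketch the 2D branch would classify the image returned by `motion_history_image`, the 3D branch would classify the raw clip, and `fuse_predictions` would combine their class probabilities into one label.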

“…TensorFlow [1], a deep learning framework, was used for transfer learning which is the concept of using a pretrained CNN and retraining the penultimate layer that does classification before the output. This type of learning is ideal for this study due to our relatively small dataset [22]. The results of the retraining process can be viewed using the suite of visualisation tools on TensorBoard.…”

Section: The Network Infrastructure (mentioning)

confidence: 99%

A Machine Vision Approach to Human Activity Recognition using Photoplethysmograph Sensor Data

Brophy, Dominguez, Wang et al. 2018

2018 29th Irish Signals and Systems Conference (ISSC)

The current gold standard for human activity recognition (HAR) is based on the use of cameras. However, the poor scalability of camera systems renders them impractical in pursuit of the goal of wider adoption of HAR in mobile computing contexts. Consequently, researchers instead rely on wearable sensors and in particular inertial sensors. A particularly prevalent wearable is the smart watch which due to its integrated inertial and optical sensing capabilities holds great potential for realising better HAR in a non-obtrusive way. This paper seeks to simplify the wearable approach to HAR through determining if the wrist-mounted optical sensor alone typically found in a smartwatch or similar device can be used as a useful source of data for activity recognition. The approach has the potential to eliminate the need for the inertial sensing element which would in turn reduce the cost of and complexity of smartwatches and fitness trackers. This could potentially commoditise the hardware requirements for HAR while retaining the functionality of both heart rate monitoring and activity capture all from a single optical sensor. Our approach relies on the adoption of machine vision for activity recognition based on suitably scaled plots of the optical signals. We take this approach so as to produce classifications that are easily explainable and interpretable by non-technical users. More specifically, images of photoplethysmography signal time series are used to retrain the penultimate layer of a convolutional neural network which has initially been trained on the ImageNet database. We then use the 2048 dimensional features from the penultimate layer as input to a support vector machine. Results from the experiment yielded an average classification accuracy of 92.3%. This result outperforms that of an optical and inertial sensor combined (78%) and illustrates the capability of HAR systems using standalone optical sensing elements which also allows for both HAR and heart rate monitoring. Finally, we demonstrate through the use of tools from research in explainable AI how this machine vision approach lends itself to more interpretable machine learning output.
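The idea in the abstract above, freezing a pretrained network and training only a classifier on its features, can be sketched in NumPy. This is a toy illustration under stated assumptions, not the paper's TensorFlow pipeline: `extract_features` uses a fixed projection with ReLU as a hypothetical stand-in for the ImageNet-pretrained CNN, and `train_linear_svm` is a plain two-class linear classifier trained by subgradient descent on the hinge loss, standing in for the support vector machine. All names and hyperparameters are illustrative.

```python
import numpy as np


def extract_features(images, projection):
    """Hypothetical frozen feature extractor: fixed projection followed by ReLU.

    Stands in for the penultimate-layer features of a pretrained CNN;
    `projection` is never updated during training.
    """
    flat = np.asarray(images, dtype=np.float64).reshape(len(images), -1)
    return np.maximum(flat @ projection, 0.0)


def train_linear_svm(features, labels, epochs=200, lr=0.1, reg=1e-3):
    """Fit w, b by subgradient descent on the L2-regularised hinge loss.

    `labels` must contain -1 or +1. Only this classifier is trained;
    the feature extractor stays frozen.
    """
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.float64)
    n, d = features.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = labels * (features @ w + b)
        active = margins < 1.0  # samples that violate the margin
        grad_w = reg * w - (labels[active, None] * features[active]).sum(axis=0) / n
        grad_b = -labels[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(features, w, b):
    """Return +1 or -1 for each feature row."""
    return np.where(np.asarray(features) @ w + b >= 0.0, 1, -1)
```

A typical use would call `extract_features` once on the training images, pass the result to `train_linear_svm`, and reuse the same `projection` at prediction time. Because the extractor is never updated, only the small classifier has to be fit, which is why this style of transfer learning suits small datasets.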

“…However, due to the occlusion and appearance change, tracking method remains a challenging problem [6]. Feature descriptors, such as co-occurrence matrix [7], pixel change history [8], mixture of dynamic texture [9], histograms of oriented swarms with histograms of gradients [10], convolutional neural network [11], were proposed for event analysis. These methods relied on the semantic segmentation performance to a certain extent.…”

Section: Introduction (mentioning)

confidence: 99%

Abnormal global and local event detection in compressive sensing domain

et al. 2018

Self Cite

Abnormal event detection, also known as anomaly detection, is one challenging task in security video surveillance. It is important to develop effective and robust movement representation models for global and local abnormal event detection to fight against factors such as occlusion and illumination change. In this paper, a new algorithm is proposed. It can locate the abnormal events on one frame, and detect the global abnormal frame. The proposed algorithm employs a sparse measurement matrix designed to represent the movement feature based on optical flow efficiently. Then, the abnormal detection mission is constructed as a one-class classification task via merely learning from the training normal samples. Experiments demonstrate that our algorithm performs well on the benchmark abnormal detection datasets against state-of-the-art methods.
