“…1). This is in contrast to existing methods, which typically either extract global representations of the entire image [6,45,46,7] or video sequence [38,16], and thus do not focus on the action itself, or localize feature extraction to the action via dense trajectories [43,42,9], optical flow [8,45,20], or actionness [44,3,14,48,39,24], and thus fail to exploit contextual information. To the best of our knowledge, only two-stream networks [30,8,4,22] have attempted to jointly leverage both types of information, by using RGB frames in conjunction with optical flow to localize the action.…”
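For orientation only, the sketch below illustrates the generic two-stream idea referred to above: an appearance stream operating on RGB frames and a motion stream operating on stacked optical-flow fields, with their class scores fused at the end. It is a simplified illustration under assumed placeholder names (`StreamCNN`, `TwoStreamNet`, the layer sizes, and the flow-stack length), not the specific architectures of [30,8,4,22].

```python
# Minimal sketch of a generic two-stream network (hypothetical names and layer
# sizes; not the exact architectures of [30,8,4,22]).
import torch
import torch.nn as nn

class StreamCNN(nn.Module):
    """One convolutional stream; in_channels differs per modality:
    3 for an RGB frame, 2*L for a stack of L horizontal/vertical flow fields."""
    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=7, stride=2, padding=3),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(3, stride=2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        f = self.features(x).flatten(1)
        return self.classifier(f)

class TwoStreamNet(nn.Module):
    """Appearance (RGB) and motion (optical flow) streams with late score fusion."""
    def __init__(self, num_classes, flow_stack=10):
        super().__init__()
        self.rgb_stream = StreamCNN(3, num_classes)
        self.flow_stream = StreamCNN(2 * flow_stack, num_classes)

    def forward(self, rgb, flow):
        # Average the per-stream class scores (late fusion).
        return 0.5 * (self.rgb_stream(rgb) + self.flow_stream(flow))

if __name__ == "__main__":
    net = TwoStreamNet(num_classes=21)
    rgb = torch.randn(4, 3, 224, 224)    # batch of RGB frames
    flow = torch.randn(4, 20, 224, 224)  # batch of 10 stacked x/y flow fields
    print(net(rgb, flow).shape)          # -> torch.Size([4, 21])
```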