2011
DOI: 10.1007/978-3-642-19282-1_47
An Unsupervised Framework for Action Recognition Using Actemes

Abstract: In speech recognition, phonemes have proven effective for modeling the words of a language. While they are well defined for languages, their extension to human actions is not straightforward. In this paper, we study such an extension and propose an unsupervised framework to find phoneme-like units for actions, which we call actemes, using 3D data and without any prior assumptions. To this end, we build on a framework previously proposed in the speech literature to automatically find actemes in the …
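The abstract outlines an unsupervised pipeline for discovering acteme units from 3D data, and one of the citation statements below summarizes the paper's method as k-means, one-pass DP decoding, and HMMs over 3D visual-hull features. As a rough illustration only, not the authors' actual implementation, the Python sketch below covers just the clustering step: it groups per-frame 3D descriptors with k-means and collapses runs of identical labels into candidate acteme segments. The feature array, descriptor dimensionality, and number of units are hypothetical placeholders.

```python
# Minimal sketch (assumptions noted above): cluster per-frame 3D descriptors
# and collapse runs of identical cluster labels into candidate "acteme" segments.
import numpy as np
from sklearn.cluster import KMeans

def discover_actemes(frame_features: np.ndarray, n_units: int = 20):
    """frame_features: (n_frames, d) per-frame 3D descriptors
    (e.g. flattened visual-hull occupancy or joint positions).
    Returns a list of (start, end, unit_id) segments with half-open [start, end)."""
    labels = KMeans(n_clusters=n_units, n_init=10, random_state=0).fit_predict(frame_features)

    segments, start = [], 0
    for t in range(1, len(labels) + 1):
        # Close the current segment when the label changes or the sequence ends.
        if t == len(labels) or labels[t] != labels[start]:
            segments.append((start, t, int(labels[start])))
            start = t
    return segments

# Toy usage: 300 frames of 48-dimensional synthetic descriptors.
rng = np.random.default_rng(0)
feats = rng.normal(size=(300, 48))
print(discover_actemes(feats, n_units=5)[:5])
```

A real system would refine these segment boundaries with a decoding step (e.g. HMMs), as the citing survey below indicates for the paper's full pipeline.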

Cited by 3 publications (4 citation statements)
References 22 publications
“…Kaiser (…): MoPrim Dataset (Reng et al., 2005)
56 Krüger and Grest (2007): HMM, poses; MoPrim Dataset (Reng et al., 2005)
57 Krüger and Herzog (2013): PHMM, Bayes; …
Kulkarni et al. (2011): k-means, one-pass DP decoding, HMM, 3D visual hull; IXMAS Dataset (Weinland et al., 2006)
61 Kulkarni et al. (1989): Truth; Weizmann Dataset (Blank et al., 2005), IXMAS Dataset (Weinland et al., 2006), Robust Dataset (Gorelick et al., 2007)
102 Roh et al. (2010): VMT, PMT; Weizmann Dataset (Blank et al., 2005), MSR-II Dataset (Cao et al., 2010)
106 Roshtkhari and Levine (2012): bag of video words, codebook, STV; KTH Dataset (Schüldt et al., 2004), Weizmann Dataset (Blank et al., 2005), MSR-II Dataset (Cao et al., 2010)
107 (Liu et al., 2009): Hollywood2 Dataset (Marszalek et al., 2009), HMDB Dataset (Kuehne et al., 2011)
112 Weizmann Dataset (Blank et al., 2005), UCF-Sports Dataset…”
Section: Results (mentioning)
confidence: 99%
“…We will compare this result with our experiments on this dataset. There are several related works on action recognition with the IXMAS dataset, for example [23][24][25][43]. Gu et al. [44] listed all state-of-the-art experimental results on this dataset.…”
Section: Results (mentioning)
confidence: 99%
“…A challenge of this solution is the inherent ambiguity between 2D image features and 3D poses. Some researchers use multiple-view videos [23][24][25], although single-view image sequences are more generic and easier to acquire. Moreover, recent work shows that even in monocular image sequences, reconstruction ambiguity can be tackled using regression methods such as the relevance vector machine (RVM) [26].…”
Section: Introduction (mentioning)
confidence: 99%
“…They use "strange attractors" to represent the dynamics of time series for action and dynamic texture synthesis, yet do not provide a way to compare two series of observations. Other approaches, inspired by speech and gesture recognition, represent actions as sequences of states [3,13,18,26] or use dynamic probabilistic graphical models [2,21,35,36] to model the temporal aspects of the videos. A limitation of these methods is that they only measure alignments between videos.…”
Section: Introduction (mentioning)
confidence: 99%
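The last statement notes that sequence-of-states approaches ultimately compare videos by aligning them. The sketch below shows one generic form of such frame-level alignment, dynamic time warping, on synthetic per-frame features; it illustrates the alignment idea only and is not any specific cited method.

```python
# Minimal dynamic time warping (DTW) sketch over synthetic per-frame features.
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """a: (n, d) and b: (m, d) per-frame feature sequences; returns the alignment cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])          # local frame-to-frame distance
            cost[i, j] = d + min(cost[i - 1, j],              # insertion
                                 cost[i, j - 1],              # deletion
                                 cost[i - 1, j - 1])          # match
    return float(cost[n, m])

rng = np.random.default_rng(2)
seq1 = rng.normal(size=(40, 16))   # hypothetical 40-frame sequence, 16-dim features
seq2 = rng.normal(size=(55, 16))   # hypothetical 55-frame sequence
print(dtw_distance(seq1, seq2))
```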