2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops) 2011
DOI: 10.1109/iccvw.2011.6130272

Measuring and reducing observational latency when recognizing actions

Abstract: An important aspect in interactive, action-based interfaces is the latency in recognizing the action. High latency will cause the system's feedback to lag behind user actions, reducing the overall quality of the user experience. This paper presents a novel dataset and algorithms for reducing the latency in recognizing the action. Latency in classification is minimized with a classifier based on logistic regression that uses canonical poses to identify the action. The classifier is trained from the dataset usin…
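The abstract's core idea — a logistic-regression classifier over skeleton-pose features that labels actions by their canonical poses — can be sketched as a minimal multinomial logistic regression on synthetic pose vectors. This is an illustrative sketch only: the joint count, class count, data generation, and training details below are invented assumptions, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not from the paper): 15 joints in 3-D,
# 3 action classes, 90 labeled training frames.
J, C, N = 15, 3, 90
D = 3 * J

# Synthetic data: each class clusters around its own "canonical pose".
canonical = rng.normal(size=(C, D))
y = np.repeat(np.arange(C), N // C)
X = canonical[y] + 0.1 * rng.normal(size=(N, D))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Multinomial logistic regression fit by batch gradient descent.
W, b = np.zeros((D, C)), np.zeros(C)
onehot = np.eye(C)[y]
for _ in range(200):
    P = softmax(X @ W + b)
    W -= 1.0 * X.T @ (P - onehot) / N
    b -= 1.0 * (P - onehot).mean(axis=0)

pred = softmax(X @ W + b).argmax(axis=1)
train_acc = (pred == y).mean()
```

Because each frame is classified independently, a scheme like this can emit a label as soon as a discriminative pose appears, rather than waiting for the whole action — which is the latency argument the abstract makes.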

Cited by 35 publications (27 citation statements). References 18 publications.
“…Both papers limit input to color and depth maps. Only Masood et al [19] and Sung et al [32] use joint sequences from depth sensors as a feature. In [19], only skeleton joints are used as a feature for real-time single activity recognition and actions are detected by logistic regression.…”
Section: Related Work
confidence: 99%
“…These datasets are focused on simple activities or gestures [19,16,2], or daily activities [22,32] performed by a single actor such as drinking water, cooking, entering the room, etc.…”
Section: Related Work
confidence: 99%
“…From skeletal joint locations, Yang and Tian [27] used the relative differences of joint positions, which is a compact representation of the structure of the skeleton for actions involving multiple joints. In a number of existing approaches, the displacement of joint position between the current frame and other frames is calculated in order to capture the joint motion across time [28][29][30]. There are also some techniques associating features extracted from depth data with 3D joint data for action recognition that involves interaction with the environment [31,32].…”
Section: Previous Work
confidence: 99%
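The two skeletal feature families this citation describes — relative differences of joint positions within a frame, and displacements of joint positions between the current frame and earlier frames — can be sketched as a small feature extractor. The function name, reference joint, and frame offset are illustrative choices, not taken from any of the cited papers.

```python
import numpy as np

def joint_features(seq, ref_joint=0, offset=1):
    """Per-frame skeleton features from a (T, J, 3) joint-position sequence.

    Concatenates:
      - relative positions: each joint minus a reference joint (e.g. the hip),
        a compact encoding of skeleton structure within a frame;
      - displacements: joint positions at frame t minus frame t - offset,
        capturing joint motion across time.
    """
    rel = seq - seq[:, ref_joint:ref_joint + 1, :]   # (T, J, 3)
    disp = np.zeros_like(seq)
    disp[offset:] = seq[offset:] - seq[:-offset]     # (T, J, 3); zeros at start
    T = seq.shape[0]
    return np.concatenate([rel.reshape(T, -1), disp.reshape(T, -1)], axis=1)

# Smooth random trajectory: 30 frames, 15 joints.
seq = np.cumsum(np.random.default_rng(0).normal(size=(30, 15, 3)), axis=0)
feats = joint_features(seq)  # shape (30, 90)
```

Subtracting a reference joint makes the structural part translation-invariant, while the displacement part carries the temporal information that pure per-frame poses lack.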
“…A number of other approaches combine different types of joint features [28][29][30]32] but as all skeleton joints in each frame are involved, noise introduced by unrelated joints during specific actions may degrade the classification result. Furthermore, since they mostly use direct concatenation of relative positions of joints in combination with joint displacements across time, noise introduced in one feature will impact the discriminative power of the entire feature vector.…”
Section: Previous Work
confidence: 99%
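The noise-sensitivity argument above — that direct concatenation of all joints lets one corrupted, action-irrelevant joint degrade the whole feature vector — can be illustrated numerically with a nearest-template comparison. All joint counts, indices, and noise magnitudes here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
J = 20
relevant = [0, 1, 2, 3]  # joints assumed to discriminate the action (illustrative)

# Two action templates that differ only in the relevant joints.
a = rng.normal(size=(J, 3))
b = a.copy()
b[relevant] += 1.0

# A query of class "a" with large sensor noise on one unrelated joint.
query = a.copy()
query[15] += 6.0

def dist(q, t, joints):
    # Euclidean distance over the concatenated coordinates of the given joints.
    return np.linalg.norm((q - t)[joints])

all_joints = np.arange(J)
# Margin = (distance to wrong template) - (distance to right template).
margin_full = dist(query, b, all_joints) - dist(query, a, all_joints)
margin_rel = dist(query, b, relevant) - dist(query, a, relevant)
# The noisy joint inflates both distances in the full concatenation,
# collapsing the margin; the relevant-joint subset is unaffected.
```

The classification is still correct here, but the shrunken margin shows how a single noisy coordinate erodes the discriminative power of the entire concatenated vector, which is exactly the failure mode the quoted passage describes.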
“…Representations based directly on raw joint positions are widely used due to the simple acquisition from sensors. Although normalization procedures can make human representations partially invariant to view and scale variations, more sophisticated construction techniques (e.g., deep learning) are typically needed to develop robust human representations. Representations without involving temporal information are suitable to address problems such as pose and gesture recognition.…”

Table (from the citing survey — approach; encoding; structure; learning):
[210] Moving Pose; BoW; Lowlv; Dict
Bloom et al [211] Dynamic Features; Conc; Lowlv; Hand
Vemulapalli et al [212] Lie Group Manifold; Conc; Manif; Hand
Zhang and Parker [213] BIPOD; Stat; Body; Hand
Lv and Nevatia [214] HMM/Adaboost; Conc; Lowlv; Hand
Herda et al [215] Quaternions; Conc; Body; Hand
Negin et al [216] RDF Kinematic Features; Conc; Lowlv; Unsup
Masood et al [217] Logistic Regression; Conc; Lowlv; Hand
Meshry et al [218] Angle & Moving Pose; BoW; Lowlv; Unsup
Tao and Vidal [219] Moving Poselets; BoW; Body; Dict
Eweiwi et al [220] Discriminative Action Features; Conc; Lowlv; Unsup
Wang et al [221] Ker-RP; Stat; Lowlv; Hand
Salakhutdinov et al [222] HD Models; Conc; Lowlv; Deep
Section: Discussion
confidence: 99%