Motivation. Automatic recognition of human activities (or events) from video is important to many potential applications of computer vision. One of the most common approaches is the bag-of-visual-features, which aggregates space-time features globally over the entire video clip containing the complete execution of a single activity. The bag-of-visual-features representation does not encode the spatio-temporal structure of the video. For this reason, there is growing interest in modeling the spatio-temporal structure between visual features in order to improve activity recognition results.

The proposed framework. We model the spatio-temporal structure by exploiting qualitative relationships between pairs of visual features. The proposed approach is inspired by [3, 4]. The goal is to find pairs of visual features whose spatio-temporal relationships are sufficiently discriminative, and temporally consistent, to distinguish various activities. The framework is applied to recognize activities from a continuous live video (egocentric view) of a person performing manipulative tasks in an industrial setup. In such environments, the purpose of activity recognition is to assist users by providing on-the-fly instructions from an automatic system that maintains an understanding of the ongoing activities.

In order to recognize activities in real time, we propose a random forest with a discriminative Markov decision tree algorithm that considers a random subset of relational features at a time and a Markov temporal structure that provides temporally smoothed output (Fig. 1). Our algorithm differs from conventional decision trees [2]: it uses a linear SVM as the classifier at each non-terminal node and effectively explores temporal dependency at the terminal nodes of the trees. We explicitly model the spatial relationships left, right, top, bottom, very-near, near, far and very-far, as well as the temporal relationships during, before and after, between a pair of visual features (Fig. 2), which are selected randomly at the non-terminal nodes of a given Markov decision tree. Our hypothesis is that the proposed relationships are particularly suitable for detecting complex, non-periodic manipulative tasks and can easily be applied to existing visual descriptors such as SIFT, STIP, CUBOID and SURF.

Growing discriminative Markov decision trees. Each tree is trained separately on a random subset of frames belonging to the training videos. Learning proceeds recursively by splitting the training frames at internal nodes into left and right subsets. This is done in the following four stages: randomly assign all frames from each activity class to a binary label; randomly sample a pair of visual words; compute the spatio-temporal relationship histogram h between them; and use a linear SVM to learn a binary split on the extracted h. The binary SVM at each internal node sends a frame to the left child if w^T h ≤ 0 and to the right child otherwise, where w is the set of weights learned by the linear SVM. Using an information gain criterion, ea...
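
The following is a minimal sketch (not the authors' code) of the node-split idea described above: it builds a qualitative spatio-temporal relation histogram h for a sampled pair of visual words and learns the binary split with a linear SVM. The feature layout (x, y, t) per detected feature, the distance and time thresholds, the function names, and the use of scikit-learn's LinearSVC are all illustrative assumptions, not details taken from the paper.

```python
# Sketch of one internal-node split of a discriminative Markov decision tree.
# Assumptions: each visual feature is an (x, y, t) tuple already assigned to a
# visual word; thresholds below are placeholders, not values from the paper.
import numpy as np
from sklearn.svm import LinearSVC

SPATIAL = ["left", "right", "top", "bottom", "very-near", "near", "far", "very-far"]
TEMPORAL = ["during", "before", "after"]

def pairwise_relations(fa, fb, near_thr=20.0, far_thr=80.0, t_thr=5):
    """Qualitative spatial and temporal relations between features fa (word A)
    and fb (word B), each given as (x, y, t). Thresholds are assumptions."""
    dx, dy, dt = fb[0] - fa[0], fb[1] - fa[1], fb[2] - fa[2]
    rels = []
    rels.append("right" if dx > 0 else "left")    # horizontal relation
    rels.append("bottom" if dy > 0 else "top")    # vertical relation (image coords)
    d = np.hypot(dx, dy)                          # qualitative distance
    if d < near_thr / 2:
        rels.append("very-near")
    elif d < near_thr:
        rels.append("near")
    elif d < far_thr:
        rels.append("far")
    else:
        rels.append("very-far")
    if abs(dt) <= t_thr:                          # qualitative temporal relation
        rels.append("during")
    elif dt > 0:
        rels.append("before")                     # A occurs before B
    else:
        rels.append("after")
    return rels

def relation_histogram(feats_a, feats_b):
    """Normalized histogram h of relations over all (A, B) feature pairs in a frame."""
    bins = {r: i for i, r in enumerate(SPATIAL + TEMPORAL)}
    h = np.zeros(len(bins))
    for fa in feats_a:
        for fb in feats_b:
            for r in pairwise_relations(fa, fb):
                h[bins[r]] += 1
    return h / max(h.sum(), 1.0)

def learn_node_split(histograms, binary_labels):
    """Learn the binary split at an internal node: frames whose activity classes
    were randomly mapped to 0/1 are separated by a linear SVM on their histograms."""
    svm = LinearSVC()
    svm.fit(np.vstack(histograms), binary_labels)
    return svm

def route_frame(svm, h):
    """Send the frame to the left child if w^T h <= 0, otherwise to the right
    (the SVM bias term is ignored here to mirror the rule stated above)."""
    return "left" if float(svm.coef_[0] @ h) <= 0 else "right"
```

In this sketch, growing a tree would repeat the sample-pair / build-h / fit-SVM steps at each internal node and recurse on the resulting left and right frame subsets, scoring candidate splits with the information gain criterion mentioned above.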