2015 IEEE International Conference on Computer Vision (ICCV) 2015
DOI: 10.1109/iccv.2015.226

Lending A Hand: Detecting Hands and Recognizing Activities in Complex Egocentric Interactions

Abstract: Hands appear very often in egocentric video, and their appearance and pose give important cues about what people are doing and what they are paying attention to. But existing work in hand detection has made strong assumptions that work well in only simple scenarios, such as with limited interaction with other people or in lab settings. We develop methods to locate and distinguish between hands in egocentric video using strong appearance models with Convolutional Neural Networks, and introduce a simple candidat…

Citations: cited by 365 publications (346 citation statements)
References: 29 publications (50 reference statements)
“…We collected a dataset of first-person video from interacting subjects [1], using Google Glass to capture video (720p30) from each person’s viewpoint, as illustrated in Figure 1. The subjects were asked to perform four different activities: (1) playing cards; (2) playing fast chess; (3) solving a jigsaw puzzle; and (4) playing Jenga (a 3d puzzle game).…”
Section: Hand Interactions (mentioning)
confidence: 99%
“…We briefly describe the approach here; more details, as well as an in-depth quantitative evaluation, are presented elsewhere [1]. The hand extraction process consists of two major steps: detection, which tries to coarsely locate hands in each frame, and segmentation, which estimates the fine-grained pixel-level shape of each hand.…”
Section: Hand Interactions (mentioning)
confidence: 99%
“…[23,20] reason about state changes in household objects, and [17,54] reason about human-object interactions. Adding mid-level cues such as face, gaze, and hands has also been investigated by [37,19,18,49,6]. Hybrid approaches [43,38,56] utilize both object and motion information.…”
Section: Related Work (mentioning)
confidence: 99%
“…According to survey of [11], the most commonly explored objective of egocentric vision is object recognition and tracking. Furthermore, hands are among the most common objects in the user's field of view, and a proper detection, localization, and tracking could be a main input for other objectives, such as gesture recognition, understanding hand-object interactions, and activity recognition [5, 12–20]. Recently, egocentric pixel-level hand detection has attracted more and more attention.…”
Section: Related Work (mentioning)
confidence: 99%