2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
DOI: 10.1109/cvprw.2012.6239174
Gamesourcing to acquire labeled human pose estimation data

Abstract: In this paper, we present a gamesourcing method for automatically and rapidly acquiring labeled images of human poses to obtain ground truth data as input for human pose estimation from 2D images. Typically, these datasets are constructed manually through a tedious process of clicking on joint locations in images. By using a low-cost RGBD sensor, we capture synchronized, registered images, depth maps, and skeletons of users playing a movement-based game and automatically filter the data to keep a subset of uni…

Cited by 7 publications (3 citation statements) · References 14 publications
“…Because the output of the Kinect is subject to noise and misinterpretation of the input, joint annotations are not always correct and need to be checked manually. Souvenir et al [2012] solve this problem by automatically removing frames with incorrect joint annotations from the dataset. When one of the joint positions in the annotation lies away from the body (which they check automatically against the depth map), they remove that frame from the dataset.…”
Section: Manual Processing of the Annotation Data
confidence: 99%
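The filtering rule described above — discard a frame whenever a reported joint does not lie on the body, as judged by the depth map — can be sketched roughly as follows. This is a hypothetical illustration, not the authors' implementation: the frame layout (a dict with a `depth` array and `joints` as pixel coordinates plus sensor depth) and the `depth_tolerance` threshold are assumptions for the sketch.

```python
import numpy as np

def filter_frames(frames, depth_tolerance=0.1):
    """Keep only frames whose skeleton joints lie on the body.

    A joint is treated as valid when its reported depth agrees (within
    depth_tolerance, in meters) with the depth map at its projected image
    location; a joint floating off the body disagrees, and the whole
    frame is discarded. Hypothetical sketch of the check described in
    the citation above.
    """
    kept = []
    for frame in frames:
        depth_map = frame["depth"]          # H x W depth image (meters)
        h, w = depth_map.shape
        valid = True
        for (u, v, z) in frame["joints"]:   # pixel column, row, sensor depth
            if not (0 <= v < h and 0 <= u < w):
                valid = False               # joint projects outside the image
                break
            if abs(depth_map[v, u] - z) > depth_tolerance:
                valid = False               # joint depth disagrees with body surface
                break
        if valid:
            kept.append(frame)
    return kept
```

For example, a frame whose joints all sit on a surface at 2.0 m passes, while a frame with a joint reported 0.5 m in front of that surface is dropped.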
“…Shotton et al [2011] show that dataset size really matters for this class of detection problems. Souvenir et al [2012] used the Kinect output directly as the ground-truth annotation and dropped the mis-detected frames. However, this might lead to chronic discarding of difficult poses and lighting/background conditions, which would leave the dataset incomplete.…”
Section: Introduction
confidence: 99%
“…As with all crowdsourced data, it is important to validate that the data is reliable. In this article, which extends a small pilot on a single game with a few users [Souvenir et al 2012], we describe how we collect and filter data from an RGB-D sensor during gameplay and compare our automatically generated data to manually curated datasets regularly used to train and test learning-based human pose estimation algorithms on 2D images.…”
Section: Introduction
confidence: 99%