2D Articulated Human Pose Estimation and Retrieval in (Almost) Unconstrained Still Images

Eichner, Marcin; Marín-Jiménez, Manuel J.; Zisserman, Andrew; Ferrari, Vittorio

doi:10.1007/s11263-012-0524-9

Cited by 228 publications

(245 citation statements)

References 46 publications

(117 reference statements)

Citation statement types: Supporting, Mentioning (244), Contrasting


“…To avoid detecting the same person twice, each face detection is then regressed into the coordinate frame of the upper-body detector and suppressed if it overlaps substantially with any upper-body detection. As shown in [6] this combination yields a higher detection rate at the same false-positive rate, compared to using either component alone.…”

Section: Upper Body Detection (mentioning)

confidence: 82%
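The suppression step described in the statement above can be sketched roughly as follows. This is a minimal illustration, not the cited detector: the box format `(x1, y1, x2, y2)`, the fixed scaling in `face_to_upper_body`, the use of intersection-over-union as the overlap measure, and the 0.5 threshold are all assumptions made for the example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), image y grows downwards


def face_to_upper_body(face: Box, width_scale: float = 2.5,
                       height_scale: float = 3.0) -> Box:
    """Map a face box into the upper-body frame (hypothetical fixed scaling)."""
    x1, y1, x2, y2 = face
    w, h = x2 - x1, y2 - y1
    cx = (x1 + x2) / 2.0
    new_w, new_h = w * width_scale, h * height_scale
    top = y1 - 0.25 * h  # start slightly above the face, extend downwards
    return (cx - new_w / 2.0, top, cx + new_w / 2.0, top + new_h)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def combine_detections(upper_bodies: List[Box], faces: List[Box],
                       overlap_threshold: float = 0.5) -> List[Box]:
    """Keep all upper-body boxes; add a regressed face box only if it does
    not overlap substantially with any upper-body detection."""
    combined = list(upper_bodies)
    for face in faces:
        candidate = face_to_upper_body(face)
        if all(iou(candidate, ub) < overlap_threshold for ub in upper_bodies):
            combined.append(candidate)
    return combined
```

A face that falls inside an existing upper-body window is dropped, while a face with no nearby upper-body detection contributes a new window.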

“…Here, we use the publicly available detector of [6]. It combines an upper-body detector based on the part-based model of [8] and a face detector [18].…”

Section: Upper Body Detection (mentioning)

confidence: 99%

“…Table 2 gives the percentage of samples where pose is estimated accurately, i.e. the CPC between the estimated and ground-truth stickmen is < 0.3. In all cases, we use the implementations provided by the authors [5,16,2,22] and all methods are given the same detection windows [6] as preprocessing. Both measures agree on the relative ranking of the methods: Yang and Ramanan [22] performs best, followed by Sapp et al. [16], Eichner and Ferrari [5] and then by Andriluka et al. [2].…”

Section: Datasets (mentioning)

confidence: 99%
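The accuracy figure described in the statement above (the percentage of samples whose CPC falls below 0.3) can be sketched as a simple thresholded count. The CPC values are assumed to be precomputed, one per sample; the excerpt does not define how CPC itself is calculated, so the sketch does not attempt to. The function name is hypothetical.

```python
from typing import Sequence


def accuracy_at_threshold(cpc_scores: Sequence[float],
                          threshold: float = 0.3) -> float:
    """Percentage of samples whose CPC is strictly below the threshold."""
    if not cpc_scores:
        raise ValueError("at least one CPC score is required")
    hits = sum(1 for score in cpc_scores if score < threshold)
    return 100.0 * hits / len(cpc_scores)
```

For example, `accuracy_at_threshold([0.1, 0.25, 0.3, 0.5])` counts two of the four samples as accurate, since a score equal to the threshold does not satisfy the strict inequality.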

“…The performance is evaluated under two regimes: (A) only where the predicted HPE corresponds to one of the annotations. Since the images are fairly completely annotated, any upper body detection window [6] which does not correspond to an annotation is considered a false positive. In this regime such false positives are ignored; (B) all predictions are evaluated, including false-positives.…”

Section: Assessment Of the Pose Quality Evaluator (mentioning)

confidence: 99%


Has My Algorithm Succeeded? An Evaluator for Human Pose Estimators

Jammalamadaka, Zisserman, Eichner, et al. 2012

Computer Vision – ECCV 2012


Abstract. Most current vision algorithms deliver their output 'as is', without indicating whether it is correct or not. In this paper we propose evaluator algorithms that predict if a vision algorithm has succeeded. We illustrate this idea for the case of Human Pose Estimation (HPE). We describe the stages required to learn and test an evaluator, including the use of an annotated ground truth dataset for training and testing the evaluator (and we provide a new dataset for the HPE case), and the development of auxiliary features that have not been used by the (HPE) algorithm, but can be learnt by the evaluator to predict if the output is correct or not. Then an evaluator is built for each of four recently developed HPE algorithms using their publicly available implementations: Eichner and Ferrari [5], Sapp et al. [16], Andriluka et al. [2] and Yang and Ramanan [22]. We demonstrate that in each case the evaluator is able to predict if the algorithm has correctly estimated the pose or not.


“…However, these works are limited to cases involving isolated people, and extending them to situations with multiple interacting people is not straightforward. Recently, a model for joint reasoning about poses of multiple upright people has been proposed in [12]. This framework does not output segmentations of people, but can be adapted to do so.…”

Section: Related Work (mentioning)

confidence: 99%

Pose Estimation and Segmentation of People in 3D Movies

Alahari, Seguin, Šivic, et al. 2013

2013 IEEE International Conference on Computer Vision


We seek to obtain a pixel-wise segmentation and pose estimation of multiple people in a stereoscopic video. This involves challenges such as dealing with unconstrained stereoscopic video, non-stationary cameras, and complex indoor and outdoor dynamic scenes. The contributions of our work are two-fold: First, we develop a segmentation model incorporating person detection, pose estimation, as well as colour, motion, and disparity cues. Our new model explicitly represents depth ordering and occlusion. Second, we introduce a stereoscopic dataset with frames extracted from feature-length movies "StreetDance 3D" and "Pina". The dataset contains 2727 realistic stereo pairs and includes annotation of human poses, person bounding boxes, and pixel-wise segmentations for hundreds of people. The dataset is composed of indoor and outdoor scenes depicting multiple people with frequent occlusions. We demonstrate results on our new challenging dataset, as well as on the H2view dataset from (Sheasby et al. ACCV 2012).


Recurrent bidirectional visual human pose retrieval

Sun, Zhang, Akashi 2019

IEEJ Transactions Elec Engng


Content‐based image retrieval is an essential component of many applications. In this paper, instead of visual similarity defined by colors, shapes, or textures, we aim to retrieve images according to the visual similarity defined by human pose. In our framework, all poses are derived from images, building on recent developments in three‐dimensional (3D) human pose reconstruction. Furthermore, to make retrieval more robust to reconstruction error, we propose a recurrent bidirectional similarity measure called recurrent best‐buddies similarity (RBBS). Specifically, we treat the similarity measure between two visual poses as a distance measure between two point vectors, with each point representing one of the reconstructed 3D human pose candidates. We then recur the similarity measure by the displacement of the query. As a justification, we verify the validity of RBBS in a one‐dimensional (1D) Gaussian setting. In experiments, we build an original dataset for the retrieval task. Both the qualitative and quantitative results show the usefulness of our framework; in particular, the quantitative results, evaluated by mean average precision (MAP or mAP), show that RBBS improves on the most competitive alternative methods by 14.13%. © 2019 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
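To make the bidirectional idea in the abstract above concrete, here is a minimal sketch of a plain (non-recurrent) best-buddies similarity between two point sets. It is not the RBBS measure proposed in the paper, whose formula the abstract does not give. The definition used here, the number of mutual nearest-neighbour pairs divided by the size of the smaller set, is an assumption, as are the function names and the use of squared Euclidean distance.

```python
from typing import Sequence, Tuple

Point = Tuple[float, ...]  # e.g. one reconstructed 3D pose candidate, flattened


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_index(point: Point, candidates: Sequence[Point]) -> int:
    """Index of the candidate closest to `point`."""
    return min(range(len(candidates)),
               key=lambda j: squared_distance(point, candidates[j]))


def best_buddies_similarity(p: Sequence[Point], q: Sequence[Point]) -> float:
    """Fraction of mutual nearest-neighbour pairs between p and q."""
    if not p or not q:
        raise ValueError("both point sets must be non-empty")
    nn_of_p = [nearest_index(pi, q) for pi in p]  # p_i -> nearest q_j
    nn_of_q = [nearest_index(qj, p) for qj in q]  # q_j -> nearest p_i
    # A pair (i, j) counts only if each point is the other's nearest neighbour.
    mutual = sum(1 for i, j in enumerate(nn_of_p) if nn_of_q[j] == i)
    return mutual / min(len(p), len(q))
```

Because a pair counts only when the match holds in both directions, a single outlying candidate in one set cannot by itself raise the score, which is one plausible reading of why a bidirectional measure helps against reconstruction error.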

