Proceedings Ninth IEEE International Conference on Computer Vision 2003
DOI: 10.1109/iccv.2003.1238408

Inferring 3D structure with a statistical image-based shape model

Abstract: We present an image-based approach to infer 3D structure parameters using a probabilistic "shape+structure" model.

Cited by 166 publications (123 citation statements)
References 16 publications
“…People tracking is comparatively simpler if multiple calibrated cameras can be used simultaneously. Techniques such as space carving [12,13], 3D voxel extraction from silhouettes [14], fitting to silhouette and stereo data [15][16][17], and skeleton-based techniques [18,19] have been used with some success. If camera motion and background scenes are controlled, so that body silhouettes are easy to extract, these techniques can be very effective.…”
Section: Related Work (mentioning)
confidence: 99%
“…To learn activities from video, earlier work emphasized tracking and explicit body-part models (e.g., [19,23,22]). In parallel, many methods to estimate body pose have been developed, including techniques using nonlinear manifolds to represent the complex space of joint configurations [12,32,3,16,28,29]; in contrast to our work, such methods assume silhouette (background-subtracted) inputs and/or derive models from mocap data, and are often intended for motion synthesis applications. More recently, researchers have considered how activity classes can be learned directly from lower-level spatiotemporal appearance and motion features, for example, based on bag-of-words models for video (e.g., [15,31]).…”
Section: Related Work (mentioning)
confidence: 99%
“…This trick has even been employed to produce flipped versions of video sequences for activity recognition [31]. The availability of humanoid models in graphics software, together with mocap data, make it possible to generate synthetic images useful for training action recognition [18] and pose estimation methods [26,12,27]. Web images noisily labeled by tags can also serve as a "free" source of data for action classification [13].…”
Section: Related Work (mentioning)
confidence: 99%
“…It was successfully used in many applications, such as: human pose recognition (Shotton et al, 2011), object 3D structure inferring (Grauman et al, 2003), shape models learning (Stark et al, 2010), pedestrian detection (Marin et al, 2010) (Pishchulin et al, 2011) (Enzweiler et al, 2008), viewpoint-independent object detection (Liebelt et al, 2008), text recognition (Wang et al, 2011) and keypoints recognition (Ozuysal et al, 2007).…”
Section: Related Work (mentioning)
confidence: 99%