Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004.
DOI: 10.1109/cvpr.2004.1315253

Names and faces in the news

Abstract: We show quite good face clustering is possible for a dataset of inaccurately and ambiguously labelled face images. Our dataset is 44,773 face images, obtained by applying a face finder to approximately half a million captioned news images. This dataset is more realistic than usual face recognition datasets, because it contains faces captured "in the wild" in a variety of configurations with respect to the camera, taking a variety of expressions, and under illumination of widely varying color. Each face image i…

Cited by 284 publications (310 citation statements)
References 19 publications
“…no text). The difference in the difficulty is apparent by comparing the examples in [6] with those used for evaluation in §3. For example, in [6] the face image size is restricted to be at least 86 × 86 pixels, whilst a significant number of faces we use are of lower resolution.…”
Section: Previous Work (mentioning)
confidence: 99%
“…Fitzgibbon and Zisserman [14] investigated face clustering in feature films, though without explicitly using facial features for registration. Berg et al [6] consider the problem of clustering detected frontal faces extracted from web news pages. In a similar manner to us, affine registration with an underlying SVM-based facial feature detector is used for face rectification.…”
Section: Previous Work (mentioning)
confidence: 99%
“…Sentence semantics only provides ambiguous and implicit labels. This resembles another line of work that learns structured output from image captions (Berg et al 2004;Gupta and Davis 2008;Luo et al 2009;Jamieson et al 2010a, b;Plummer et al 2015;Mao et al 2016), treating the input as a parallel image-text dataset. However, all of these methods, except Gupta and Davis (2008) and Jamieson et al (2010a, b) use pretrained object models learned from other datasets.…”
Section: Related Work (mentioning)
confidence: 99%
“…Everingham et al [26,27] addressed the problem of automatically labeling faces of characters in TV or film materials with their names. Similar to the "Faces in the News" labeling in [16], where detected frontal faces in news images are tagged with names appearing in the news story text, they proposed to combine visual cues (face and cloth) and textual cues (subtitle and transcript) for assigning names. Regarding face processing [3], face detections in each frame are linked to derive face tracks, and each face is represented by local appearance descriptors computed around 13 facial features.…”
Section: Face Retrieval In Video (mentioning)
confidence: 99%