2017
DOI: 10.1007/978-3-319-73013-4_20

Organizing Multimedia Data in Video Surveillance Systems Based on Face Verification with Convolutional Neural Networks

Abstract: In this paper we propose a two-stage approach to organizing information in video surveillance systems. First, faces are detected in each frame and the video stream is split into sequences of frames that contain the face region of one person. Second, the sequences (tracks) that contain identical faces are grouped using face verification algorithms and hierarchical agglomerative clustering. Gender and age are estimated for each cluster (person) in order to facilitate the usage of the organized video col…
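
The second stage described in the abstract, grouping tracks with hierarchical agglomerative clustering, can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' implementation: the abstract does not state the feature extractor, the distance, the linkage rule, or the stopping threshold. The sketch assumes each frame of a track already has a face feature vector (for example from a CNN), summarizes a track by the L2-normalized mean of its frame features, and merges tracks with average linkage over Euclidean distance until the closest pair of clusters is farther apart than a chosen threshold. The names `track_descriptor` and `cluster_tracks` are hypothetical.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def track_descriptor(frame_features):
    """Summarize one track by the L2-normalized mean of its per-frame features."""
    count = len(frame_features)
    dim = len(frame_features[0])
    mean = [sum(f[i] for f in frame_features) / count for i in range(dim)]
    return l2_normalize(mean)


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_tracks(descriptors, threshold):
    """Average-linkage agglomerative clustering of track descriptors.

    Starts with one cluster per track and repeatedly merges the two closest
    clusters; stops when the closest pair is farther apart than `threshold`
    (an assumed, hand-chosen value). Returns lists of track indices.
    """
    clusters = [[i] for i in range(len(descriptors))]
    while len(clusters) > 1:
        best = None  # (distance, index_a, index_b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                dist = sum(euclidean(descriptors[i], descriptors[j])
                           for i, j in pairs) / len(pairs)
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        if best[0] > threshold:
            break
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Toy 2-D "features": tracks 0 and 1 show the same face, track 2 a different one.
tracks = [
    [[1.0, 0.0], [0.9, 0.1]],
    [[0.95, 0.05]],
    [[0.0, 1.0], [0.1, 0.9]],
]
descriptors = [track_descriptor(t) for t in tracks]
people = cluster_tracks(descriptors, threshold=0.5)
print(people)
```

Each resulting cluster stands for one person, so per-person attributes such as gender and age could then be estimated once per cluster instead of once per frame.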

Citations: cited by 23 publications (9 citation statements)
References: 20 publications
“…Only each of, for example, three or five frames, is selected in each video clip, extract identity features of all detected faces and initially cluster only the faces found in this clip. After that the normalized average of identity features of all clusters (Sokolova, Kharchevnikova & Savchenko, 2017) are computed. They are added to the dataset {X r } so that the "Facial clustering" module handles both features of all photos and average feature vectors of subjects found in all videos.…”
Section: Proposed Pipeline For Organizing Photo and Video Albums (mentioning)
confidence: 99%
“…Nowadays, due to the extreme increase in multimedia resources, there is an urgent need to develop intelligent methods to process and organize them (Manju & Valarmathie, 2015). For example, the task of automatic structuring of photo and video albums is attracting increasing attention (Sokolova, Kharchevnikova & Savchenko, 2017;Zhang & Lu, 2002). The various photo organizing systems allow users to group and tag photos and videos in order to retrieve large number of images in the media library (He et al, 2017).…”
Section: Introduction (mentioning)
confidence: 99%
“…The same procedure is repeated for all video files. Only each of, e.g., 3 or 5 frames, is selected in each video clip, extract identity features of all detected faces and initially cluster only the faces found (Sokolova et al, 2017) are computed. They are added to the dataset {X r } so that the "Facial clustering" module handles both features of all photos and average feature vectors of subjects found in all videos.…”
Section: Proposed Pipeline For Organizing Photo and Video Albums (mentioning)
confidence: 99%
“…Nowadays, due to the extreme increase in multimedia resources there is an urgent need to develop intelligent methods to process and organize them (Manju and Valarmathie, 2015). For example, the task of automatic structuring of photo and video albums is attracting increasing attention (Sokolova et al, 2017;Zhang and Lu, 2002). The various photo organizing systems allow users to group and tag photos and videos in order to retrieve large number of images in the media library (He et al, 2017).…”
Section: Introduction (mentioning)
confidence: 99%
“…For example, they collect thousands of images (frames) every second [5,6,7]. Consequently, there is a challenge of ordering the visitors, whose faces were observed by a surveillance system [8].…”
Section: Introduction (mentioning)
confidence: 99%