Video object matching across multiple independent views using local descriptors and adaptive learning

Teixeira, Luís; Côrte-Real, Luís

doi:10.1016/j.patrec.2008.04.001

Cited by 43 publications

(26 citation statements)

References 19 publications

“…Consequently, their identities can be switched. Moreover, prolonged occlusion might occur, which might lead to track loss or mistaken identities (Teixeira and Corte-Real 2009). Since the cameras are supposed to track all objects in their coverage area, the definition of a global identity for each object is necessary.…”

Section: Relevance and Problem Definition

mentioning

confidence: 99%

Learning from evolving video streams in a multi-camera scenario

2015

Nowadays, video surveillance systems are taking the first steps toward automation, in order to ease the burden on human resources as well as to avoid human error. As the underlying data distribution and the number of concepts change over time, the conventional learning algorithms fail to provide reliable solutions for this setting. In this paper, we formalize a learning concept suitable for multi-camera video surveillance and propose a learning methodology adapted to that new paradigm. The proposed framework resorts to the universal background model to robustly learn individual object models from small samples and to more effectively detect novel classes. The individual models are incrementally updated in an ensemble-based approach, with older models being progressively forgotten. The framework is designed to detect and label new concepts automatically. The system is also designed to exploit active learning strategies, in order to interact wisely with the operator, requesting assistance with the observations that are most ambiguous to classify. The experimental results obtained both on real and synthetic data sets verify the usefulness of the proposed approach.

“…As an alternative, a classifier can be trained to learn the changes between cameras using labeled features. Support Vector Machines (SVM) can be employed with DCT features (Bauml et al, 2010) and SIFT (Teixeira and Corte-Real, 2009). An improvement is the Ensemble SVM, which reduces the computational cost of rankSVM for high-dimensional feature spaces besides converting the re-identification problem into a ranking problem (Prosser et al, 2010).…”

Section: Association

mentioning

confidence: 99%

“…Finally, interest points can be used for re-identification in case of variations in scale, pose and illumination (Bauml and Stiefelhagen, 2011). Examples are SIFT (Teixeira and Corte-Real, 2009), SURF-like features (Hamdoun et al, 2008; Oliveira and Luiz, 2009) and the Hessian Affine invariant operator (Gheissari et al, 2006). When intra-camera tracking information is available, features extracted from single images can be grouped over time either by temporal accumulation (Hamdoun et al, 2008) or by clustering (Farenzena et al, 2010).…”

mentioning

confidence: 99%

Person re-identification in crowd

Mazzon, Tahir, Cavallaro

2012

Pattern Recognition Letters

Person re-identification aims to recognize the same person viewed by disjoint cameras at different time instants and locations. In this paper, after an extensive review of state-of-the-art approaches, we propose a re-identification method that takes into account the appearance of people, the spatial location of cameras and potential paths a person can choose to follow. This choice is modeled with a set of areas of interest (landmarks) that constrain the propagation of people trajectories in non-observed regions between the field-of-view of cameras. We represent people with a selective patch around their upper body to work in crowded scenes when occlusions are frequent. We demonstrate the proposed method in a challenging scenario from London Gatwick airport and compare it to well-known person re-identification methods, highlighting their strengths and limitations. Finally, we show by Cumulative Matching Characteristic curve that the best performance results by modeling people movements in non-observed regions combined with appearance methods, achieving an average improvement of 6% when only appearance is used and 15% when only motion is used for the association of people across cameras.

“…In [4], each object is represented as a "bag-of-visterms" where the visual words are local features. A model is created for each individual detected in the site.…”

Section: Related Work

mentioning

confidence: 99%

“…When the cameras have overlapping fields of view, information about the geometrical relations among the camera views can be estimated and used to establish correspondences [1,2]. In the case of disjoint views, other information about the moving objects must be used to automatically identify multiple instances of the same object [3,4].…”

Section: Introduction

mentioning

confidence: 99%

Object Matching in Distributed Video Surveillance Systems by LDA-Based Appearance Descriptors

Presti, Sclaroff, Cascia

2009

Image Analysis and Processing – ICIAP 2009

Abstract. Establishing correspondences among object instances is still challenging in multi-camera surveillance systems, especially when the cameras' fields of view are non-overlapping. Spatiotemporal constraints can help in solving the correspondence problem but still leave a wide margin of uncertainty. One way to reduce this uncertainty is to use appearance information about the moving objects in the site. In this paper we present the preliminary results of a new method that can capture salient appearance characteristics at each camera node in the network. A Latent Dirichlet Allocation (LDA) model is created and maintained at each node in the camera network. Each object is encoded in terms of the LDA bag-of-words model for appearance. The encoded appearance is then used to establish probable matching across cameras. Preliminary experiments are conducted on a dataset of 20 individuals and comparison against Madden's I-MCHR is reported.
