Abstract. Still-to-video face recognition (FR) systems used in video surveillance applications capture facial trajectories across a network of distributed video cameras and compare them against stored distributed facial models. Currently, the performance of state-of-the-art systems is severely affected by changes in facial appearance caused by variations in, e.g., pose, illumination and scale across different camera viewpoints. Moreover, since an individual is typically enrolled using one or a few reference stills captured during enrolment, face models are not robust to intra-class variation. In this paper, the Extended Sparse Representation Classification through Domain Adaptation (ESRC-DA) algorithm is proposed to improve the performance of still-to-video FR. The system's facial models are thereby enhanced by integrating variational information from its operational domain. In particular, robustness to intra-class variations is improved by exploiting: (1) an under-sampled dictionary from target reference facial stills captured under controlled conditions; and (2) an auxiliary dictionary from an abundance of unlabelled facial trajectories captured under different conditions, from each camera viewpoint in the surveillance network. The accuracy and efficiency of the proposed technique are compared with those of state-of-the-art still-to-video FR techniques using videos from the Chokepoint and COX-S2V databases. Results indicate that ESRC-DA with dictionary learning of unlabelled trajectories provides the highest level of accuracy, while maintaining low complexity.
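To make the two-dictionary idea concrete, the following is a minimal sketch of generic extended sparse representation classification, not the ESRC-DA algorithm itself. It assumes a probe face vector y is coded jointly over a reference dictionary D (one or a few stills per identity) and an auxiliary variation dictionary A, and that the identity is chosen by the smallest class-wise reconstruction residual. The function names `ista_l1` and `esrc_classify`, the parameter `lam`, and the use of a plain ISTA solver are illustrative assumptions; the domain adaptation and dictionary learning from unlabelled trajectories described above are not modelled here.

```python
import numpy as np


def ista_l1(B, y, lam=0.01, n_iter=500):
    """Minimise 0.5*||y - B c||^2 + lam*||c||_1 with plain ISTA (assumed solver)."""
    # Step size is 1 / Lipschitz constant of the gradient (squared spectral norm of B).
    step = 1.0 / (np.linalg.norm(B, 2) ** 2)
    c = np.zeros(B.shape[1])
    for _ in range(n_iter):
        z = c - step * (B.T @ (B @ c - y))                        # gradient step
        c = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)  # soft threshold
    return c


def esrc_classify(D, labels, A, y, lam=0.01):
    """Classify probe y with a reference dictionary D and an auxiliary dictionary A.

    D      : (d, n) array, columns are reference stills (hypothetical features)
    labels : (n,) array, identity of each column of D
    A      : (d, m) array, columns are intra-class variation atoms
    y      : (d,) probe vector
    """
    n = D.shape[1]
    c = ista_l1(np.hstack([D, A]), y, lam)   # joint sparse code over [D, A]
    x, beta = c[:n], c[n:]
    residuals = {}
    for k in np.unique(labels):
        x_k = np.where(labels == k, x, 0.0)  # keep only coefficients of identity k
        residuals[k] = np.linalg.norm(y - D @ x_k - A @ beta)
    best = min(residuals, key=residuals.get)
    return best, residuals
```

In this sketch the auxiliary atoms absorb appearance variation shared by all identities, so a probe that differs from its reference still mainly in pose or illumination can still be reconstructed by the correct identity's column plus a few atoms of A.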
Introduction

With the availability of low-cost video cameras and high-capacity memory, technologies for video surveillance (VS) have become more prevalent in recent years. VS networks are increasingly deployed by public security organizations in, e.g., airports, train stations and border crossings. Accurate and robust systems are required to recognize individuals and their actions from video feeds. In VS, decision support systems can rely on facial information (along with other sources, like soft biometrics) to alert an analyst as to the presence of individuals of interest. The ability to automatically recognize faces in videos recorded across

The final publication is available at Springer via http://dx