We present a robust representation for gait recognition.
1. Introduction

The possibility of gait recognition was demonstrated in the early 1970s using point-light displays. Over the past few years, a variety of approaches to gait recognition have been proposed, almost all of which are based on matching silhouettes of persons. Approaches to gait recognition typically involve matching time series of features extracted in each frame. Possible frame features include the raw silhouettes themselves, as in the gait baseline algorithm [8], shape PCA coefficients [15], shape moments [12], silhouette width vectors [11,4,13], and body-part ellipses [10]. The matching of the trajectories of these features relies on simple spatio-temporal correlation [15,8,11], matching maps of silhouette correlations [1], or dynamic time warping and HMMs [4,13]. Apart from these classes of approaches, which tend to emphasize both the shape of the silhouette and its evolution over time, there are approaches that emphasize just the shape [3,14] or use static body parameters [6], with almost equal or better performance than the first class of approaches.

(This research was supported by funds from DARPA (F49620-00-1-00388) and NSF (EIA 0130768).)

Given this collective wisdom about gait recognition, it is pertinent to ask: what is the simplest representation that suffices for gait recognition? The quest for the simplest representation is meaningful from both a computational and a robustness point of view; simpler ideas tend to generalize across a wider range of conditions. Toward this end, we propose the averaged silhouette representation: we simply take the sum of the silhouettes over approximately one gait cycle as the gait representation. The matching process then considers the Euclidean distances between these averaged silhouettes. The representation is robust with respect to gait-cycle length estimates and does not depend on the choice of the starting stance of the gait cycle.
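The representation and matching process described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes pre-segmented, size-normalized binary silhouettes stacked as a T x H x W array, and it averages (rather than sums) over the cycle so that sequences of slightly different cycle lengths remain directly comparable. All function names and array shapes are illustrative.

```python
import numpy as np

def averaged_silhouette(silhouettes):
    """Average a stack of binary silhouette frames (T x H x W),
    covering approximately one gait cycle, into a single H x W image."""
    return np.mean(np.asarray(silhouettes, dtype=float), axis=0)

def gait_distance(probe_avg, gallery_avg):
    """Euclidean distance between two averaged silhouettes."""
    return np.linalg.norm(probe_avg - gallery_avg)

# Toy example with random "silhouettes"; note the two sequences
# need not have the same number of frames.
rng = np.random.default_rng(0)
seq_a = rng.integers(0, 2, size=(30, 64, 44))  # ~one gait cycle
seq_b = rng.integers(0, 2, size=(28, 64, 44))
d = gait_distance(averaged_silhouette(seq_a), averaged_silhouette(seq_b))
```

Because the per-frame silhouettes are simply accumulated, no stance matching or temporal alignment is needed before comparison, which is the robustness property the text emphasizes.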
There is no need for stance matching or gait alignment before matching. The idea behind the proposed representation is somewhat similar to the summed symmetry maps proposed in [5], where the bilateral symmetry map of each silhouette is first extracted and then summed over one gait cycle. We, however, do not even extract the symmetry maps. The use of cumulative images for motion-based human activity recognition is not new. Bobick and Davis [2] used temporal templates, vector-images constructed by weighted image-differencing through the motion history, to identify different human activities, such as sitting down, waving arms, and crouching down. We show that this kind of representation seems to be sufficient for recognition as well.

We use the HumanID Gait Challenge framework [8] to demonstrate the efficacy of the proposed representation. The challenge problem, which is being used by the gait community [13,14,9], consists of a baseline algorithm, a set of twelve experiments (A through L), and a large data set (1870 sequences, 122 subjects, 1.2 Terabytes of d...