Multimodal Feature Learning for Gait Biometric Based Human Identity Recognition

Hossain, Emdad; Chetty, Girija

doi:10.1007/978-3-642-42042-9_89

Cited by 18 publications

(6 citation statements)

References 17 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…The model-free approach is based on extracting gait from VS using feature engineering, as proposed in [36], [37]. Here, deep learning is utilized to automatically extract gait features from VS, which maximizes the use of data variability and eliminates the dependence on handcrafting.…”

Section: A Video Sequencementioning

confidence: 99%

Deep Learning for Monitoring of Human Gait: A Review

2019

View full text Add to dashboard Cite

The essential human gait parameters are briefly reviewed, followed by a detailed review of the state of the art in deep learning for the human gait analysis. The modalities for capturing the gait data are grouped according to the sensing technology: video sequences, wearable sensors, and floor sensors, as well as the publicly available datasets. The established artificial neural network architectures for deep learning are reviewed for each group, and their performance are compared with particular emphasis on the spatiotemporal character of gait data and the motivation for multi-sensor, multi-modality fusion. It is shown that by most of the essential metrics, deep learning convolutional neural networks typically outperform shallow learning models. In the light of the discussed character of gait data, this is attributed to the possibility to extract the gait features automatically in deep learning as opposed to the shallow learning from the handcrafted gait features.

show abstract

Section: A Video Sequencementioning

confidence: 99%

Deep Learning for Monitoring of Human Gait: A Review

2019

View full text Add to dashboard Cite

show abstract

“…Although several papers can be found for the task of human action recognition using deep learning techniques, it is hard to find such type of approaches applied to the problem of gait recognition. In [15], Hossain and Chetty propose the use of Restricted Boltzmann Machines to extract gait features from binary silhouettes, but a very small probe set (i.e. only ten different subjects) were used for validating their approach.…”

Section: Related Workmentioning

confidence: 99%

Automatic Learning of Gait Signatures for People Identification

Castro

Marín-Jiménez

Guil

et al. 2017

Lecture Notes in Computer Science

View full text Add to dashboard Cite

This work targets people identification in video based on the way they walk (i.e. gait). While classical methods typically derive gait signatures from sequences of binary silhouettes, in this work we explore the use of convolutional neural networks (CNN) for learning high-level descriptors from low-level motion features (i.e. optical flow components). We carry out a thorough experimental evaluation of the proposed CNN architecture on the challenging TUM-GAID dataset. The experimental results indicate that using spatio-temporal cuboids of optical flow as input data for CNN allows to obtain state-of-the-art results on the gait task with an image resolution eight times lower than the previously reported results (i.e. 80 × 60 pixels).

show abstract

“…He et al[10] proposed a new kind of CNN, named ResNet, which has a large number of convolutional layers and 'residual connections' to avoid the vanishing gradient problem.Although several papers can be found for the task of human action recognition using DL techniques, few works apply DL to the problem of gait recognition. In [22], Hossain and Chetty propose the use of Restricted Boltzmann Machines to extract gait features from binary silhouettes, but a very small probe set (i.e. only ten different subjects) were used for validating their approach.…”

mentioning

confidence: 99%

Multimodal feature fusion for CNN-based gait recognition: an empirical comparison

Castro

Marín-Jiménez

Guil

et al. 2020

Neural Comput & Applic

View full text Add to dashboard Cite

People identification in video based on the way they walk (i.e. gait) is a relevant task in computer vision using a non-invasive approach. Standard and current approaches typically derive gait signatures from sequences of binary energy maps of subjects extracted from images, but this process introduces a large amount of non-stationary noise, thus, conditioning their efficacy. In contrast, in this paper we focus on the raw pixels, or simple functions derived from them, letting advanced learning techniques to extract relevant features. Therefore, we present a comparative study of different Convolutional Neural Network (CNN) architectures on three low-level features (i.e. gray pixels, optical flow channels and depth maps) on two widely-adopted and challenging datasets: TUM-GAID and CASIA-B. In addition, we perform a comparative study between different early and late fusion methods used to combine the information obtained from each kind of low-level features. Our experimental results suggest that (i ) the use of hand-crafted energy maps (e.g. GEI) is not necessary, since equal or better results can be achieved from the raw pixels; (ii ) the combination of multiple modalities (i.e. gray pixels, optical flow and depth maps) from different CNNs allows to obtain state-of-the-art results on the gait task with an image resolution several times smaller than the previously reported results; and, (iii ) the selection of the architecture is a critical point that can make the difference between state-of-the-art results or poor results. He et al.[10] proposed a new kind of CNN, named ResNet, which has a large number of convolutional layers and 'residual connections' to avoid the vanishing gradient problem.Although several papers can be found for the task of human action recognition using DL techniques, few works apply DL to the problem of gait recognition. In [22], Hossain and Chetty propose the use of Restricted Boltzmann Machines to extract gait features from binary silhouettes, but a very small probe set (i.e. only ten different subjects) were used for validating their approach. A more recent work, [23], uses a random set of binary silhouettes of a sequence to train a CNN that accumulates the calculated features in order to achieve a global representation of the dataset. In [24], raw 2D GEI are employed to train an ensemble of CNN, where a Multilayer Perceptron (MLP) is used as classifier. Similarly, in [25] a multilayer CNN is trained with GEI data. A novel approach based on GEI is developed on [8], where the CNN is trained with pairs of gallery-probe samples and using a distance metric. Castro et al.[26] use optical flow obtained from raw data frames. An in-dept evaluation of different CNN architectures based on optical flow maps is presented in [27]. Finally, in [28] a multitask CNN with a combined loss function with multiple kind of labels is presented.Despite most CNNs are trained with visual data (e.g. images or videos), there are some works that build CNNs for different kinds of data like inertial sensors or human skel...

show abstract

Multimodal Feature Learning for Gait Biometric Based Human Identity Recognition

Cited by 18 publications

References 17 publications

Deep Learning for Monitoring of Human Gait: A Review

Deep Learning for Monitoring of Human Gait: A Review

Automatic Learning of Gait Signatures for People Identification

Multimodal feature fusion for CNN-based gait recognition: an empirical comparison

Contact Info

Product

Resources

About