2022
DOI: 10.3390/s22197140

Gait Recognition with Self-Supervised Learning of Gait Features Based on Vision Transformers

Abstract: Gait is a unique biometric trait with several useful properties. It can be recognized remotely and without the cooperation of the individual, with low-resolution cameras, and it is difficult to obscure. Therefore, it is suitable for crime investigation, surveillance, and access control. Existing approaches for gait recognition generally belong to the supervised learning domain, where all samples in the dataset are annotated. In the real world, annotation is often expensive and time-consuming. Moreover, convolu…

Cited by 11 publications (7 citation statements)
References 40 publications
“…At 195°, accuracy is close to 90%, and the highest score, 91.80%, was achieved by SelfGait [48]. The second-highest score, 90.57%, was achieved by ViTs16 [45].…”
Section: Discussion
confidence: 76%
“…Pinčić et al [45] developed a new method that uses self-supervised learning (SSL) for the gait identification task. They used the vision transformer (ViT) architecture presented in the self-supervised DINO approach for image classification.…”
Section: Related Work
confidence: 99%
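The quoted description mentions reusing the ViT backbone from the self-supervised DINO approach. As a hedged illustration (not the authors' exact pipeline), the sketch below loads the publicly released DINO ViT-S/16 weights via torch.hub and treats the CLS embedding of a silhouette-style input as a gait feature; the input tensor, its shape, and the preprocessing are assumptions made for the example.

```python
# Illustrative only: a DINO-pretrained ViT-S/16 used as a frozen gait-feature
# extractor for a silhouette-style input (e.g., a gait energy image).
# Input shape and preprocessing are assumptions, not the paper's setup.
import torch

backbone = torch.hub.load('facebookresearch/dino:main', 'dino_vits16')  # official DINO weights
backbone.eval()

gei = torch.rand(4, 1, 224, 224)        # hypothetical batch of single-channel gait energy images
with torch.no_grad():
    x = gei.repeat(1, 3, 1, 1)          # ViT expects 3-channel input
    features = backbone(x)              # CLS-token embedding, shape [4, 384]

print(features.shape)                   # torch.Size([4, 384])
```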
“…Moreover, the Transformer model has been applied in gait recognition tasks. For instance, Mogan et al [28] and Pinčić et al [29] directly employed the Vision Transformer (ViT) model on gait silhouettes. These methods involve converting gait silhouette images into one-dimensional sequences, followed by feature extraction and classification using the ViT model.…”
Section: Transformer
confidence: 99%
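The patchify step described in the quote above can be sketched as follows. The sizes (64×64 silhouettes, 8×8 patches, 192-dimensional tokens) are illustrative assumptions; the snippet only shows how a silhouette image becomes the one-dimensional token sequence a ViT encoder consumes, not the cited models themselves.

```python
# Minimal sketch of the patchify step: cut a silhouette into fixed-size patches,
# project each patch linearly, and flatten the result into a token sequence.
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    """Split an image into non-overlapping patches and project each to a token."""
    def __init__(self, img_size=64, patch_size=8, in_chans=1, embed_dim=192):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        # A strided convolution implements "cut into patches + linear projection" in one step.
        self.proj = nn.Conv2d(in_chans, embed_dim, kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        x = self.proj(x)                      # [B, D, H/P, W/P]
        return x.flatten(2).transpose(1, 2)   # [B, N, D] one-dimensional token sequence

silhouettes = torch.rand(2, 1, 64, 64)        # dummy binary-silhouette batch
tokens = PatchEmbed()(silhouettes)
print(tokens.shape)                           # torch.Size([2, 64, 192])
```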
“…By stacking attentional layers that scan the sequence, Transformers are capable of producing position- and context-aware representations. Inspired by Transformers, a few attempts have been made to introduce transformer-like architectures to vision tasks [29], [30], one of which, called the vision transformer (ViT) [31], has been successfully applied to image recognition and shows competitive performance [32], [33]. Hussain et al [34] explored a pretrained Vision Transformer to extract frame-level features and then passed the features to a long short-term memory network to recognize human activities.…”
Section: Table I The Representative Sample Data Collected During Gait...
confidence: 99%
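The ViT-plus-LSTM pattern attributed to Hussain et al [34] in the quote above can be illustrated roughly as below. The backbone choice (DINO ViT-S/16), clip length, hidden size, and number of classes are assumptions for the sketch, not the configuration reported in [34].

```python
# Rough sketch: a pretrained ViT encodes each frame independently, and an LSTM
# aggregates the per-frame embeddings over time for sequence-level classification.
import torch
import torch.nn as nn

frames = torch.rand(2, 16, 3, 224, 224)    # hypothetical clip: [batch, time, channels, H, W]
B, T = frames.shape[:2]

vit = torch.hub.load('facebookresearch/dino:main', 'dino_vits16')  # any pretrained ViT would do
vit.eval()
with torch.no_grad():
    frame_feats = vit(frames.flatten(0, 1)).reshape(B, T, 384)     # per-frame CLS embeddings

lstm = nn.LSTM(input_size=384, hidden_size=256, batch_first=True)
classifier = nn.Linear(256, 10)            # 10 hypothetical activity classes

_, (h_n, _) = lstm(frame_feats)            # h_n: [1, B, 256], final hidden state
logits = classifier(h_n[-1])               # [B, 10] class scores
print(logits.shape)
```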