2018
DOI: 10.1007/978-3-030-01264-9_41

Human Motion Analysis with Deep Metric Learning

Abstract: Effectively measuring the similarity between two human motions is necessary for several computer vision tasks such as gait analysis, person identification and action retrieval. Nevertheless, we believe that traditional approaches such as L2 distance or Dynamic Time Warping based on hand-crafted local pose metrics fail to appropriately capture the semantic relationship across motions and, as such, are not suitable for being employed as metrics within these tasks. This work addresses this limitation by means of …
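For orientation, the two traditional baselines named in the abstract can be sketched as follows. This is a minimal illustration, not the paper's method: the function names (`l2`, `framewise_l2`, `dtw_distance`) and the representation of a motion as a list of pose tuples are assumptions made for this example only.

```python
import math


def l2(p, q):
    """Euclidean (L2) distance between two pose vectors of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def framewise_l2(seq_a, seq_b):
    """Sum of per-frame L2 distances; only defined for equal-length motions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("frame-wise L2 needs motions of equal length")
    return sum(l2(p, q) for p, q in zip(seq_a, seq_b))


def dtw_distance(seq_a, seq_b, local=l2):
    """Classic dynamic time warping cost with a hand-crafted local pose metric."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning the first i frames of
    # seq_a with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = local(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of seq_b matched again
                cost[i][j - 1],      # frame of seq_a matched again
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# Toy 1-D "poses": the same movement, with a pause at the start of one
# sequence and at the end of the other.
motion_a = [(0.0,), (1.0,), (2.0,), (2.0,)]
motion_b = [(0.0,), (0.0,), (1.0,), (2.0,)]
print(framewise_l2(motion_a, motion_b))  # penalizes the timing offset
print(dtw_distance(motion_a, motion_b))  # absorbs the timing offset
```

Both scores depend entirely on the local pose metric, which is the limitation the abstract points to: warping handles timing differences, but neither measure encodes what a motion means.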

Citations: cited by 43 publications (19 citation statements)
References: 32 publications
“…Sentence Embedding/Retrieval: Existing works have considered the task of defining metrics for quantifying, ranking and retrieving similar images [34], images with specific properties [35] or hierarchical similarities [20], [36]. A deep metric learning approach [37], [38] is often taken, using contrastive or discriminative methods [19] to train a similarity metric into the embedding. For application into videos, action segmentation methods have often considered embedding at the frame level to retrieve frames with similar actions in unsupervised [39], [40] methods, or recently the aforementioned DTW-based transcript prototypes [8].…”
Section: Related Work (mentioning)
confidence: 99%
“…It's worth noting that visual data is only one type of information that can be utilized to recognize activities. RGB-D data in the context of deep learning to distinguish human actions is an example of how various types can be used [19], [20]. Sudhakaran et al [21] also published a hierarchical feature lightweight aggregation approach that can be integrated into any deep architecture with a CNN backbone in another project.…”
Section: Related Work and Contributions (mentioning)
confidence: 99%
“…On the other hand only Visual data was used in action recognition. For instance, RGB-D data have been used in deep learning to recognize human actions [13,22]. Sudhakaran et al [35] presented a hierarchical feature lightweight aggregation scheme that can be plugged into any deep architecture with CNN backbone.…”
Section: Related Work (mentioning)
confidence: 99%