2020
DOI: 10.3390/s20185258

VI-Net—View-Invariant Quality of Human Movement Assessment

Abstract: We propose a view-invariant method towards the assessment of the quality of human movements which does not rely on skeleton data. Our end-to-end convolutional neural network consists of two stages, where at first a view-invariant trajectory descriptor for each body joint is generated from RGB images, and then the collection of trajectories for all joints are processed by an adapted, pre-trained 2D convolutional neural network (CNN) (e.g., VGG-19 or ResNeXt-50) to learn the relationship amongst the different bo…
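The abstract describes a two-stage architecture: per-joint view-invariant trajectory descriptors are first extracted from RGB frames, and the stacked trajectories are then fed to an adapted, pre-trained 2D CNN (e.g., VGG-19) to produce a movement-quality score. The sketch below is only an illustration of that pipeline shape, not the authors' implementation; the module structure, tensor shapes, joint count, and the way trajectories are tiled into an image-like input are all assumptions made for the example.

```python
# Minimal sketch (not the authors' code) of the two-stage pipeline outlined
# in the abstract. All shapes, layer choices, and the trajectory-to-image
# mapping are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg19


class TrajectoryDescriptorNet(nn.Module):
    """Stage 1 (assumed form): maps an RGB clip to one view-invariant
    trajectory descriptor per body joint."""

    def __init__(self, num_joints=14, descriptor_len=224):
        super().__init__()
        # A small 3D conv encoder over frames; the real stage-1 network differs.
        self.encoder = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d((descriptor_len, 1, 1)),
        )
        self.head = nn.Linear(32, num_joints)

    def forward(self, clip):                 # clip: (B, 3, T, H, W)
        feat = self.encoder(clip)            # (B, 32, L, 1, 1)
        feat = feat.squeeze(-1).squeeze(-1)  # (B, 32, L)
        feat = feat.permute(0, 2, 1)         # (B, L, 32)
        traj = self.head(feat)               # (B, L, J)
        return traj.permute(0, 2, 1)         # (B, J, L): one descriptor per joint


class TwoStageQualityNet(nn.Module):
    """Stage 2 (assumed form): an adapted, pre-trained 2D CNN (here VGG-19)
    consumes the stacked joint trajectories and regresses a quality score."""

    def __init__(self, num_joints=14, descriptor_len=224):
        super().__init__()
        self.stage1 = TrajectoryDescriptorNet(num_joints, descriptor_len)
        backbone = vgg19(weights="IMAGENET1K_V1")   # reuse pre-trained conv layers
        self.features = backbone.features
        self.score = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(512, 1)
        )

    def forward(self, clip):
        traj = self.stage1(clip)                     # (B, J, L)
        # Resize the joint-by-time map to 224x224 and tile to 3 channels so the
        # RGB backbone accepts it (an assumption about how the trajectory
        # "image" is fed to the 2D CNN).
        traj_img = F.interpolate(
            traj.unsqueeze(1), size=(224, 224),
            mode="bilinear", align_corners=False,
        ).repeat(1, 3, 1, 1)                         # (B, 3, 224, 224)
        feat = self.features(traj_img)               # (B, 512, 7, 7)
        return self.score(feat)                      # movement-quality score


# Usage example: score = TwoStageQualityNet()(torch.randn(2, 3, 16, 112, 112))
```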

Cited by 22 publications (21 citation statements); references 46 publications.
“…In the United Kingdom, an interest group called the Digital and Informatics Physiotherapy Group (DIPG), part of the Chartered Society of Physiotherapists (CSP), was formed to develop, evaluate and promote the use of novel technologies such as virtual reality (VR), telerehabilitation and AI in clinical physiotherapy. Giving patients the possibility of exercising at their homes with automatic real-time exercise feedback [161,173,209,294,477,538,595,674,675,704,815] could positively influence exercise adherence, leading to improved patient outcomes and lowered healthcare costs [445,884]. Similarly, these personalised and gamified approaches could be designed for the general population as a preventive health measure [825], for example to increase the uptake of resistance training [63,867], as 80% of European adults do not meet the global resistance training guidelines of 2 or more days of resistance training per week [62].…”
Section: Precision Strength Training
Citation type: mentioning; confidence: 99%
“…In general, methods that rely on skeleton annotations require fulsome 3D joint representations, which are difficult to come by in in-the-wild scenarios. Recently, a few works have developed view-invariant action recognition or analysis approaches from RGB-D images, such as [9,11,21,32]. Varol et al. [9] deploy multi-view synthetic videos for training their network to perform action recognition given novel viewpoints, but still use 3D pose annotations to produce the synthetic data, while the newly generated videos would also have to be labelled by experts if they were to be used for specialist applications such as healthcare.…”
Section: Related Work
Citation type: mentioning; confidence: 99%
“…To tackle this problem, a simple solution would be to train a network on data from multiple views [9]. However, in practice, capturing a labelled multi-view dataset is cumbersome and such datasets are rare; two example cases are the commonly used NTU [10] and the recent health-related QMAR [11] datasets, where the granularity of the labelling is still coarse, at action class level and at overall performance score level respectively. Ideally, a wholly view-invariant approach would be trained on data from as few views as possible and be able to perform well on a single (unseen) view at inference time.…”
Section: Introduction
Citation type: mentioning; confidence: 99%
“…Spatiotemporal representations are densely packed with information regarding both the appearance and salient motion patterns occurring in the video clips, as illustrated in Figure 1. Due to this representational power, they are currently the best performing models on action-related tasks, such as action recognition [1][2][3][4], action quality assessment [5][6][7][8][9], skills assessment [10], and action detection [11]. This representation power comes at the cost of increased computational complexity [12][13][14][15], which makes 3D-CNNs unsuitable for deployment in resource-constrained scenarios.…”
Section: Introduction
Citation type: mentioning; confidence: 99%