“…Recent RGB (red, green, blue) based action analysis methods, such as References [2, 3, 4, 6], cannot cope with viewpoints significantly different from those in their training data. To achieve some degree of view-invariance, some works, such as References [7, 8, 9, 10, 11, 12, 13], have made use of 3D human pose obtained from (i) Kinect, (ii) motion capture, or (iii) 3D pose estimation methods.…”