“…

| Fusion method | Ref. | Modalities | Architecture | Datasets |
| --- | --- | --- | --- | --- |
| … | … | … | … | UTKinect-Action3D [63], CAD-60 [64,65], LIRIS human activities |
| Product score fusion (late fusion) | [34] | Visual (RGB) and depth dynamic images | c-ConvNet | ChaLearn LAP IsoGD [61], NTU RGB+D [53] |
| Fusion score (late fusion) | [33] | RGB and depth frames | Pre-trained VGG networks | MSRDailyActivity3D [49], UTD-MHAD [51], CAD-60 [64,65] |
| Early, intermediate, and late fusion | [13] | RGB, depth, skeleton | I3D and Shift-GCN | NTU RGB+D [53], SBU Interaction [66] |
| Cross-modality compensation block (intermediate) | [42] | RGB and depth | ResNet and VGG with CMCB | NTU RGB+D 120 [56], THU-READ [58], PKU-MMD [67] |
| SlowFast multimodality compensation block (intermediate) | [40] | RGB and depth features | Swin Transformer | NTU RGB+D 120 [56], NTU RGB+D 60 [53], THU-READ [58], PKU-MMD [67] |
| Cross-modality fusion transformer (intermediate) | [41] | RGB and depth features | ResNet50 feature extractors | NTU RGB+D 120 [56], THU-READ [58], PKU-MMD [67] |

…”
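
To make the "product score fusion (late fusion)" entry concrete, the following is a minimal sketch of late fusion at the score level, assuming each modality stream (RGB and depth) has already produced per-class scores. The function names (`softmax`, `product_score_fusion`, `average_score_fusion`, `predict`) and the example scores are hypothetical and chosen for illustration only; this is not the implementation used in [34] or in any other cited work.

```python
import math


def softmax(logits):
    """Convert raw per-class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def product_score_fusion(rgb_probs, depth_probs):
    """Late fusion: multiply per-class probabilities, then renormalize.

    Assumes both inputs are probability vectors over the same classes.
    """
    fused = [r * d for r, d in zip(rgb_probs, depth_probs)]
    total = sum(fused)
    return [f / total for f in fused]


def average_score_fusion(rgb_probs, depth_probs):
    """Late fusion alternative: average the per-class probabilities."""
    return [(r + d) / 2 for r, d in zip(rgb_probs, depth_probs)]


def predict(probs):
    """Return the index of the highest-scoring class."""
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical raw scores for three action classes from two separate streams.
rgb_probs = softmax([2.0, 1.0, 0.1])
depth_probs = softmax([0.5, 2.5, 0.3])

fused_product = product_score_fusion(rgb_probs, depth_probs)
fused_average = average_score_fusion(rgb_probs, depth_probs)

print("product fusion prediction:", predict(fused_product))
print("average fusion prediction:", predict(fused_average))
```

Because the streams are combined only at the score level, each network can be trained and swapped independently. The intermediate-fusion entries in the table instead exchange features between streams before classification, which this sketch does not cover.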