“…However, detecting and tracking the bare hand is challenging, and the complexity increases further in practical settings, for example, in a natural, crowded office or home environment (Saboo & Singha, 2021; Singha et al, 2018). This is one reason why most researchers (Gao et al, 2021; Misra & Hussain Laskar, 2019; Misra & Laskar, 2017; Misra & Laskar, 2019; Misra & Laskar, 2019b; Mukherjee et al, 2019; Nguyen et al, 2018; Saboo & Singha, 2021; Singha et al, 2018) have implemented bare-hand gesture detection in controlled environments using shallow architecture models. The complexities in detecting gesture objects include gesticulation speed; variation in illumination, pose, scale, size, and background; occlusion; and the presence of imposter-like objects.…”