“…Specifically, [20], [16] explicitly leverage the predetermined homography between camera and ground plane to determine the 3D position on the floor of each person locating his/her feet. Additionally, [16] tries to infer the feet position even when they are occluded through a deep neural network. Differently, [19] given an approximated knowledge of the orientation of the camera with respect to the ground plane, by assuming a fixed parameter of the human body (specifically, the constant height of 1.70 m) inferred by a deep network, detects social distancing violations by leveraging the heuristically determined torso length of each person.…”