“…(i) object detection, which identifies and locates a predefined set of objects of interest (see, e.g., Bochkovskiy et al, 2020), (ii) face recognition, which identifies specific individuals on the basis of their faces (see, e.g., Learned-Miller et al, 2016), (iii) activity recognition, which recognizes actions of interest performed, for instance, by humans and vehicles (see, e.g., Jobanputra et al, 2019), and (iv) crowd violence detection, which detects outbreaks of crowd violence (see, e.g., Gkountakos et al, 2020).…”