“…Since the cameras are fixed, object segmentation and single-view tracking can be performed using background suppression. Unlike in the perimeter layer, vision in this layer is indoor, so simpler statistical approaches such as [15] could be equally effective while being more efficient. To deal with object occlusions within the same view, appearance models (based on color, texture, contour, etc.) can be exploited [17,23]. Vision-based people counting at gates has been widely explored in the literature [29,2,6]; apart from ad hoc approaches, however, it can also be interpreted as the outcome of a correct people tracker (see [31] for a survey) that observes all the entrance/exit gates of an environment, which is our case. To increase the counting accuracy, additional sensors could be deployed, as detailed in our case study in Section 3.…”
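
To illustrate how people counting can be read off the output of a tracker, the following is a minimal sketch, not the method of any cited work. It assumes that each tracked person is available as a sequence of 2D ground-plane or image points, and that each gate is modeled as a directed line from `gate_a` to `gate_b`, with the left side taken (by convention here) as the inside of the environment. The names `side_of_line` and `count_gate_crossings` are hypothetical, and the gate is treated as an infinite line rather than a bounded segment.

```python
from typing import Iterable, Sequence, Tuple

Point = Tuple[float, float]


def side_of_line(p: Point, a: Point, b: Point) -> float:
    """Signed area test: > 0 if p lies left of the directed line a->b, < 0 if right."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def count_gate_crossings(
    tracks: Iterable[Sequence[Point]], gate_a: Point, gate_b: Point
) -> Tuple[int, int]:
    """Count (entries, exits) across a gate line from per-person trajectories.

    Assumption: the left side of gate_a->gate_b is "inside", so a move from
    the right side to the left side is an entry and the opposite is an exit.
    """
    entries = 0
    exits = 0
    for track in tracks:
        prev_side = None  # last decisive (non-zero) side seen for this track
        for point in track:
            side = side_of_line(point, gate_a, gate_b)
            if side == 0:
                continue  # exactly on the line: wait for a decisive sample
            if prev_side is not None and (prev_side < 0) != (side < 0):
                if side > 0:
                    entries += 1
                else:
                    exits += 1
            prev_side = side
    return entries, exits


if __name__ == "__main__":
    # Hypothetical trajectories around a gate lying on the x-axis.
    demo_tracks = [
        [(5.0, -1.0), (5.0, 1.0)],   # walks in
        [(5.0, 2.0), (5.0, -2.0)],   # walks out
        [(1.0, 1.0), (2.0, 2.0)],    # stays inside
    ]
    n_in, n_out = count_gate_crossings(demo_tracks, (0.0, 0.0), (10.0, 0.0))
    print("entries:", n_in, "exits:", n_out, "net occupancy change:", n_in - n_out)
```

Under these assumptions, the net occupancy change is the number of entries minus the number of exits summed over all gates. Any tracking error, such as an identity switch or a track lost near the gate, propagates directly into the count, which is consistent with the remark that additional sensors could improve accuracy.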