This paper describes an approach to segmenting and locating people in crowded scenes, with application to a surveillance system for airport facilities. For robust operation, the system analyzes several visual cues (color, motion, and shape) and integrates them optimally. A general method for automatically inferring optimal cue integration rules is presented. This scheme, based on supervised training on video sequences, avoids the need to explicitly formulate combination rules from a priori constraints. The system performs at least as well as classical fusion strategies such as voting, because the optimized decision engine implicitly includes these and other strategies.
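For orientation, the voting baseline mentioned above can be sketched as a per-pixel majority vote over binary cue masks. This is only an illustration of the classical strategy, not the trained decision engine the abstract describes; the function name `majority_vote` and the toy masks are assumptions made for the example.

```python
def majority_vote(cue_masks):
    """Fuse binary per-pixel cue decisions by simple majority.

    cue_masks: list of equally sized 2-D lists of 0/1 values,
    one mask per cue (hypothetical inputs for illustration).
    """
    n_cues = len(cue_masks)
    rows, cols = len(cue_masks[0]), len(cue_masks[0][0])
    fused = []
    for r in range(rows):
        fused_row = []
        for c in range(cols):
            # Count how many cues mark this pixel as foreground.
            votes = sum(mask[r][c] for mask in cue_masks)
            # Foreground only if strictly more than half of the cues agree.
            fused_row.append(1 if 2 * votes > n_cues else 0)
        fused.append(fused_row)
    return fused


# Toy 2x3 masks standing in for the color, motion, and shape cues.
color = [[1, 0, 1], [0, 0, 1]]
motion = [[1, 1, 0], [0, 0, 1]]
shape = [[0, 1, 1], [0, 1, 1]]

fused = majority_vote([color, motion, shape])
print(fused)
```

Majority voting is one fixed mapping from cue outputs to a decision, which is consistent with the claim that a decision rule optimized over such mappings can do at least as well.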
We study two feature sets for classifying objects in videos taken in an airport. Objects are assigned to one of three classes: single person, group of people, or luggage. One feature set is based on classical geometric features, and the other on the average density of foreground pixels in areas of the blobs. In both cases, easily computed features were selected because the system must run under real-time constraints. During development, we also studied whether shadows affect the classification rate. To do so, we applied two shadow removal algorithms and estimated the usefulness of such techniques under real-time constraints.
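The two kinds of features can be illustrated with a minimal sketch on a binary blob mask. The abstract does not list the exact features, so the choices here (area, aspect ratio, and extent for the geometric set; a 2x2 grid of foreground densities over the bounding box for the density set) and all function names are assumptions chosen because they are cheap to compute.

```python
def bounding_box(mask):
    """Return (top, bottom, left, right) of the foreground pixels, inclusive."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], rows[-1], cols[0], cols[-1]


def geometric_features(mask):
    """Simple geometric descriptors of a blob (assumed feature choice)."""
    top, bottom, left, right = bounding_box(mask)
    height = bottom - top + 1
    width = right - left + 1
    area = sum(sum(row) for row in mask)
    return {
        "area": area,
        "aspect_ratio": height / width,
        # Fraction of the bounding box covered by foreground pixels.
        "extent": area / (height * width),
    }


def density_features(mask, grid_rows=2, grid_cols=2):
    """Average foreground density in each cell of a grid over the bounding box."""
    top, bottom, left, right = bounding_box(mask)
    height = bottom - top + 1
    width = right - left + 1
    densities = []
    for i in range(grid_rows):
        r0 = top + i * height // grid_rows
        r1 = top + (i + 1) * height // grid_rows
        for j in range(grid_cols):
            c0 = left + j * width // grid_cols
            c1 = left + (j + 1) * width // grid_cols
            cells = [mask[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            # Empty cells can occur for blobs smaller than the grid.
            densities.append(sum(cells) / len(cells) if cells else 0.0)
    return densities


# Toy blob: a 4x4 bounding box with the corners missing.
blob = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 0, 0],
]

print(geometric_features(blob))
print(density_features(blob))
```

Both feature sets need only a single pass or two over the blob's pixels, which fits the real-time constraint stated above; the resulting vectors would then be passed to a classifier, which is not shown here.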