In recent years, interest in advanced video-based surveillance applications has grown steadily. This is especially true in the field of urban railway transport, where video-based surveillance can be used to address many relevant security issues (e.g. vandalism, overcrowding, and abandoned objects). This paper investigates an open problem in the implementation of video-based surveillance systems for transport applications, namely the implementation of reliable image understanding modules that recognize dangerous situations with low false alarm and misdetection rates. In this work, we consider the use of a neural network-based classifier for detecting vandal behavior in metro stations. The results show that this choice of classifier achieves very good performance even in the presence of high scene complexity.