“…Models of visual attention, such as the one proposed by Itti et al. [11] or Harel's graph-based implementation [12], are frequently used in the literature for computing saliency maps. Various authors have shown that directing processing toward the regions with high values in the saliency map improves system performance in a range of computer vision tasks, such as image retrieval [13], object recognition [14,15], object tracking [16,17], and action recognition [18,19]. However, although much fundamental work has been done to generate good representations of visual saliency from still images or video content, their application to object recognition has not yet been explored in depth.…”