Research on visual saliency initially focused on still images rather than on video content. In recent years, however, demand for video saliency has grown in applications such as gaming, video editing, retargeting, smart TV, robot navigation, and surveillance. As a result, remarkable progress has been made, first in understanding eye-tracking data recorded with dynamic stimuli, and subsequently in the modeling process.

There are fundamental differences between videos and still images. For example, each video frame is observed for only a fraction of a second, whereas a still image can be viewed for much longer. Videos may also feature varying camera motion such as tilting, panning, and zooming. For these reasons, videos are probably viewed differently by human observers than still images, and comprehensive comparative studies have emerged. In [1], for example, the authors study the influence of tasks on gaze behavior in static and dynamic scenes. In [2], gaze on static and dynamic scenes is compared; the study also shows that the center bias decreases with dynamic stimuli.

In terms of modeling, static models were first extended to video. This is the case for GBVS, SDSR, NMPT, and SSOI, whose authors added dynamic features to their models, typically by fusing a static saliency map with a motion-based feature map (a generic sketch is given below). Although these existing models are major contributions, video saliency estimation methods should ultimately differ substantially from image saliency methods.
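As a rough illustration of this "static map plus dynamic feature" recipe, the sketch below fuses a spectral-residual static saliency map with a dense optical-flow magnitude map. This is a minimal sketch of the generic approach only, not the method of any of the cited models; the spectral-residual front-end, the Farnebäck flow, and the fusion weight `alpha` are all illustrative assumptions.

```python
import cv2
import numpy as np

def static_saliency(gray):
    # Spectral-residual saliency (Hou & Zhang, 2007): suppress the smooth
    # part of the log-amplitude spectrum, then reconstruct from the residual.
    small = cv2.resize(gray, (64, 64)).astype(np.float32)
    spectrum = np.fft.fft2(small)
    log_amp = np.log1p(np.abs(spectrum))
    phase = np.angle(spectrum)
    residual = log_amp - cv2.blur(log_amp, (3, 3))
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    sal = cv2.GaussianBlur(sal, (9, 9), 2.5)
    return cv2.resize(sal, (gray.shape[1], gray.shape[0]))

def motion_feature(prev_gray, gray):
    # Dense Farneback optical flow between consecutive 8-bit grayscale
    # frames; its magnitude serves as a simple dynamic feature map.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    return np.linalg.norm(flow, axis=2)

def fuse(static_map, motion_map, alpha=0.5):
    # Linear fusion of min-max-normalized maps; alpha is an illustrative
    # weight, not a value taken from any of the cited models.
    def norm(m):
        return (m - m.min()) / (m.max() - m.min() + 1e-8)
    return alpha * norm(static_map) + (1.0 - alpha) * norm(motion_map)

# Usage on two consecutive grayscale frames (2-D uint8 arrays):
# video_sal = fuse(static_saliency(gray), motion_feature(prev_gray, gray))
```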