A gaze prediction technique for open signed video content using a track before detect algorithm

Davies, S.J.C.; Agrafiotis, Dimitris; Canagarajah, C N; Bull, David

doi:10.1109/icip.2008.4711852

Cited by 4 publications

(2 citation statements)

References 8 publications

“…In one approach, the ROI is determined a priori based on saliency maps [11] obtained solely based on content analysis, typically using low-level video features such as spatial contrasts in luminance, temporal changes in motion, appearances of machine-recognized human faces, etc. It has been shown [12,13], however, that prior knowledge and context play important roles in affecting viewer's attention, and modeling these information when calculating saliency maps is a daunting task. In contrast, while we use video content to train HMM parameters during training phase, in operational phase we determine ROI based on real-time eye gaze tracking.…”

Section: Related Work (mentioning)

confidence: 99%

Hidden Markov Model for eye gaze prediction in networked video streaming

Feng, Cheung, Tan, et al. 2011

2011 IEEE International Conference on Multimedia and Expo

With the advent of eye gaze tracking technology, eye gaze is increasingly being used as a media interaction trigger in a variety of applications, such as eye typing, video content customization, and network video streaming based on region-of-interest (ROI). The reaction time of a gaze-based networked system, however, is in practice lower-bounded by the round trip time (RTT) of today's networks, which can be large. To improve the efficacy of gaze-based networked systems, in this paper we propose a Hidden Markov Model (HMM)-based gaze prediction strategy to predict future gaze locations and so lower end-to-end reaction delay. We first design an HMM with three states corresponding to the three major types of intrinsic human eye movements. HMM parameters are obtained offline on a per-video basis during the training phase. During the testing phase, a window of noisy gaze observations is collected in real time as input to a forward algorithm, which computes the most likely HMM state. Given the deduced HMM state, linear prediction is used to predict the gaze location RTT seconds into the future.

We demonstrate the applicability of our gaze prediction strategy by focusing on ROI-based bit allocation for network video streaming. To reduce the transmission rate of a video stream without degrading the viewer's perceived visual quality, we allocate more bits to encode the viewer's current spatial ROI, while devoting fewer bits to other spatial regions. The challenge lies in overcoming the delay between the time a viewer's ROI is detected by gaze tracking and the time the affected video is encoded, delivered and displayed at the viewer's terminal. To this end, we use our proposed gaze-prediction strategy to predict future eye gaze locations, so that optimized bit allocation can be performed for future frames. Our experiments show that bit rate can be reduced by 21% without noticeable visual quality degradation when end-to-end network delay is as high as 200ms.
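The prediction steps described in this abstract (a window of gaze observations, a forward algorithm to pick the most likely state, then linear prediction RTT seconds ahead) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the state names, the use of gaze speed as the observation, the Gaussian emission model, all numeric parameters, and the choice to hold the mean position during a fixation are assumptions introduced here.

```python
import math

# Assumed state labels and parameters (illustrative only, not from the paper).
STATES = ("fixation", "pursuit", "saccade")
PRIOR = [1 / 3, 1 / 3, 1 / 3]
TRANS = [[0.90, 0.05, 0.05],
         [0.05, 0.90, 0.05],
         [0.10, 0.10, 0.80]]
EMIS = [(1.0, 2.0), (15.0, 8.0), (200.0, 100.0)]  # (mean, std) of gaze speed, deg/s


def gaussian_pdf(x, mean, std):
    """Gaussian density, floored so normalization never divides by zero."""
    z = (x - mean) / std
    return max(math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi)), 1e-300)


def most_likely_state(speeds):
    """Normalized forward recursion; returns the index of the most probable current state."""
    n = len(PRIOR)
    alpha = [PRIOR[i] * gaussian_pdf(speeds[0], *EMIS[i]) for i in range(n)]
    total = sum(alpha)
    alpha = [a / total for a in alpha]
    for s in speeds[1:]:
        alpha = [sum(alpha[i] * TRANS[i][j] for i in range(n)) * gaussian_pdf(s, *EMIS[j])
                 for j in range(n)]
        total = sum(alpha)
        alpha = [a / total for a in alpha]
    return max(range(n), key=lambda j: alpha[j])


def linear_fit_predict(times, values, t_future):
    """Least-squares line through (times, values), evaluated at t_future."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    denom = sum((t - t_mean) ** 2 for t in times)
    slope = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values)) / denom
    return v_mean + slope * (t_future - t_mean)


def predict_gaze(times, xs, ys, rtt):
    """Infer the current state from the window, then predict gaze rtt seconds ahead."""
    speeds = [math.hypot(xs[k] - xs[k - 1], ys[k] - ys[k - 1]) / (times[k] - times[k - 1])
              for k in range(1, len(times))]
    state = STATES[most_likely_state(speeds)]
    if state == "fixation":
        # Assumption: during a fixation, hold the mean position of the window.
        return state, (sum(xs) / len(xs), sum(ys) / len(ys))
    t_future = times[-1] + rtt
    return state, (linear_fit_predict(times, xs, t_future),
                   linear_fit_predict(times, ys, t_future))
```

For example, `predict_gaze([0.0, 0.1, 0.2, 0.3, 0.4], [0.0, 1.0, 2.0, 3.0, 4.0], [5.0] * 5, rtt=0.2)` classifies the window under these assumed parameters and extrapolates the horizontal motion 200 ms past the last sample.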

“…Nevertheless, it has been shown [39,40] that prior knowledge and context play important roles in affecting viewer's attention. Thus, video analysis can at best provide a rough estimate of where viewers may look, in the absence of real-time information.…”

Section: ROI-based Bit Allocation for Video Coding / Streaming (mentioning)

confidence: 99%

Low-Cost Eye Gaze Prediction System for Interactive Networked Video Streaming

Feng, Cheung, Tan, et al. 2013

IEEE Trans. Multimedia

Eye gaze is now used as a content adaptation trigger in interactive media applications, such as customized advertisement in video, and bit allocation in streaming video based on region-of-interest (ROI). The reaction time of a gaze-based networked system, however, is lower-bounded by the network round trip time (RTT). Furthermore, only low-sampling-rate gaze data is available when a commonly available webcam is employed for gaze tracking. To realize responsive adaptation of media content even under non-negligible RTT and using common low-cost webcams, we propose a Hidden Markov Model (HMM) based gaze-prediction system that utilizes the visual saliency of the content being viewed. Specifically, our HMM has two states corresponding to two intrinsic human gaze behavioral movements, and its model parameters are derived offline via analysis of each video's visual saliency maps. Due to the strong prior on likely gaze locations offered by saliency information, accurate runtime gaze prediction is possible even under large RTT and using a common webcam.

We demonstrate the applicability of our low-cost gaze prediction system by focusing on ROI-based bit allocation for networked video streaming. To reduce the transmission rate of a video stream without degrading the viewer's perceived visual quality, we allocate more bits to encode the viewer's current spatial ROI, while devoting fewer bits to other spatial regions. The challenge lies in overcoming the delay between the time a viewer's ROI is detected by gaze tracking and the time the affected video is encoded, delivered and displayed at the viewer's terminal. To this end, we use our proposed low-cost gaze prediction system to predict future eye gaze locations, so that optimized bit allocation can be performed for future frames. Through extensive subjective testing, we show that bit-rate can be reduced by up to 29% without noticeable visual quality degradation when RTT is as high as 200ms.
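The ROI-based bit allocation idea in this abstract (more bits inside the viewer's ROI, fewer elsewhere) can be illustrated with a per-macroblock quantization parameter (QP) map centered on a predicted gaze location. This is only a sketch under stated assumptions: the circular ROI, its radius, and the two QP values are hypothetical choices made here, and the paper's actual optimized allocation is not reproduced. It relies on the general property of block-based codecs such as H.264 that a lower QP means finer quantization and therefore more bits.

```python
def roi_qp_map(width_mb, height_mb, gaze_mb, roi_radius_mb=4, qp_roi=24, qp_background=34):
    """Build a QP map in macroblock units around a predicted gaze position.

    gaze_mb is the (column, row) of the predicted gaze macroblock.
    All default values are illustrative assumptions, not values from the paper.
    """
    gaze_x, gaze_y = gaze_mb
    radius_sq = roi_radius_mb ** 2
    qp_rows = []
    for y in range(height_mb):
        row = []
        for x in range(width_mb):
            inside_roi = (x - gaze_x) ** 2 + (y - gaze_y) ** 2 <= radius_sq
            # Lower QP inside the ROI spends more bits where the viewer is predicted to look.
            row.append(qp_roi if inside_roi else qp_background)
        qp_rows.append(row)
    return qp_rows
```

In a streaming loop, the gaze position passed in would be the location predicted RTT seconds ahead, so that the frame encoded now has its high-quality region where the viewer is expected to be looking when it is displayed.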
