Eye gaze is now used as a content-adaptation trigger in interactive media applications, such as customized advertising in video and region-of-interest (ROI) based bit allocation in streaming video. The reaction time of a gaze-based networked system, however, is lower-bounded by the network round-trip time (RTT). Furthermore, only low-sampling-rate gaze data is available when a commonly available webcam is employed for gaze tracking. To realize responsive adaptation of media content even under non-negligible RTT and using common low-cost webcams, we propose a Hidden Markov Model (HMM) based gaze-prediction system that exploits the visual saliency of the content being viewed. Specifically, our HMM has two states corresponding to two intrinsic human gaze movement behaviors, and its model parameters are derived offline via analysis of each video's visual saliency maps. Because saliency information provides a strong prior on likely gaze locations, accurate runtime gaze prediction is possible even under large RTT and with a common webcam.

We demonstrate the applicability of our low-cost gaze-prediction system by focusing on ROI-based bit allocation for networked video streaming. To reduce the transmission rate of a video stream without degrading the viewer's perceived visual quality, we allocate more bits to encode the viewer's current spatial ROI and fewer bits to other spatial regions. The challenge lies in overcoming the delay between the time a viewer's ROI is detected by gaze tracking and the time the affected video is encoded, delivered, and displayed at the viewer's terminal. To this end, we use our proposed low-cost gaze-prediction system to predict future eye gaze locations, so that optimized bit allocation can be performed for future frames. Through extensive subjective testing, we show that bit rate can be reduced by up to 29% without noticeable visual quality degradation even when RTT is as high as 200 ms.
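The prediction scheme described above can be sketched in miniature as follows. This is an illustrative assumption-laden toy, not the paper's actual model: the two hidden states are taken to be "fixation" (gaze stays put) and "saccade" (gaze jumps to a salient region), the transition matrix values are made up, and the saccade target is simplified to the saliency-map peak. In the paper's system, these parameters would instead be derived offline from the video's saliency maps.

```python
import numpy as np

# Hypothetical two-state transition matrix (rows: current state,
# columns: next state); state 0 = fixation, state 1 = saccade.
# The values below are illustrative placeholders.
A = np.array([[0.95, 0.05],   # P(next state | currently fixating)
              [0.60, 0.40]])  # P(next state | currently in saccade)

def predict_gaze(gaze, saliency, state_probs, steps):
    """Predict the gaze location `steps` frames ahead (e.g. one RTT
    worth of frames) as a state-probability-weighted mix of staying at
    the current point (fixation) and jumping to the saliency peak
    (saccade)."""
    peak = np.unravel_index(np.argmax(saliency), saliency.shape)
    p = state_probs.copy()
    for _ in range(steps):
        p = p @ A                       # propagate hidden state forward
    # Expected location: fixation keeps the current gaze point,
    # saccade moves it to the most salient point of the frame.
    pred = p[0] * np.asarray(gaze, dtype=float) \
         + p[1] * np.asarray(peak, dtype=float)
    return pred, p

# Toy saliency map with a single bright spot at (row 4, col 12);
# viewer currently fixates at (row 4, col 3).
saliency = np.zeros((9, 16))
saliency[4, 12] = 1.0
pred, p = predict_gaze((4.0, 3.0), saliency,
                       np.array([1.0, 0.0]), steps=6)
```

Under these assumed transition probabilities, the predicted gaze drifts from the current fixation point toward the salient spot as the forecast horizon (and hence the RTT being compensated) grows, which is the behavior the bit-allocation stage would exploit when deciding which future region to encode at high quality.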