Sparkle: User-Aware Viewport Prediction in 360-Degree Video Streaming

Chen, Jinyu; Luo, Xianzhuo; Hu, Miao; Wu, Di; Zhou, Yipeng

doi:10.1109/tmm.2020.3033127

Cited by 30 publications

(11 citation statements)

References 29 publications

“…Trajectory-based approaches [12]-[16], [22]-[24], [40]-[42] predict future viewing direction from one user's (single-user) or other users' (cross-user) historical head movement trajectories. [12], [40]-[42] proposed to use historical head movement data to predict future FoV.…”

Section: B. Trajectory-based Prediction Methods (mentioning)

confidence: 99%

“…Overall, the combination of video saliency detection and historical head motion trajectories (real-time) of users can be used to predict users' FoV in the near future. Generally, FoV prediction algorithms can be divided into two categories: trajectory-based [12]-[16], [22]-[27] and content-based [17]-[20], [28]-[39].…”

Section: FoV (mentioning)

confidence: 99%

“…In addition, to gain insight into the head movement of the user while watching the 360-degree video, we introduce an additional set of measures to investigate the change in FoV in successive frames. For demonstration purposes, we measure the head movement frequency by calculating the average latitude/longitude difference between two consecutive frames in the whole video and set a threshold range for longitude and latitude [16]. If the mean longitude is greater than 0.65°, it is defined as a higher frequency of head movement in the horizontal direction (denoted as More).…”

Section: A. Experimental Setup (mentioning)

confidence: 99%
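
The measure quoted in this statement can be sketched roughly as follows. This is a minimal illustration, not the citing paper's code: the function names, the `"Less"` label for the below-threshold case, and the wrap-around handling for longitude are all assumptions. Only the 0.65° longitude threshold and the `More` label come from the quoted text; the latitude threshold is not given there, so it is left out.

```python
from typing import Sequence

LON_THRESHOLD_DEG = 0.65  # longitude threshold quoted in the statement above


def mean_consecutive_diff(angles_deg: Sequence[float], wrap_deg: float = 0.0) -> float:
    """Mean absolute difference between consecutive per-frame angles, in degrees.

    If wrap_deg > 0 (e.g. 360 for longitude), each difference is taken along
    the shorter arc, so a move from 179 to -179 counts as 2 degrees, not 358.
    The wrap handling is an assumption; the quoted text does not describe it.
    """
    if len(angles_deg) < 2:
        return 0.0
    total = 0.0
    for prev, cur in zip(angles_deg, angles_deg[1:]):
        diff = abs(cur - prev)
        if wrap_deg > 0:
            diff %= wrap_deg
            diff = min(diff, wrap_deg - diff)
        total += diff
    return total / (len(angles_deg) - 1)


def horizontal_movement_label(longitudes_deg: Sequence[float]) -> str:
    """Return "More" if the mean longitude change exceeds the threshold.

    "Less" is a hypothetical label for the other case.
    """
    mean_diff = mean_consecutive_diff(longitudes_deg, wrap_deg=360.0)
    return "More" if mean_diff > LON_THRESHOLD_DEG else "Less"
```

Here `longitudes_deg` would hold one viewing-direction longitude per video frame; the same helper with `wrap_deg=0.0` applies to latitude once a threshold is chosen.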

“…However, considering only head movements of users for FoV prediction is inaccurate. Therefore, other studies [14]- [16] have combined historical view trajectories with video object tracking to predict future FoV. For example, [14] proposes a deep learning-based FoV prediction scheme, HOP, which jointly exploits the viewer's historical FoV trajectory and target tracking through a long short-term memory (LSTM) network.…”

Section: Introduction (mentioning)

confidence: 99%

Spherical Convolution empowered FoV Prediction in 360-degree Video Multicast with Limited FoV Feedback

Li, Liu, Zhang et al. 2022

Preprint

Field of view (FoV) prediction is critical in 360-degree video multicast, which is a key component of the emerging Virtual Reality (VR) and Augmented Reality (AR) applications. Most of the current prediction methods combining saliency detection and FoV information neither take into account that the distortion of projected 360-degree videos can invalidate the weight sharing of traditional convolutional networks, nor do they adequately consider the difficulty of obtaining complete multi-user FoV information, which degrades the prediction performance. This paper proposes a spherical convolution-empowered FoV prediction method, which is a multi-source prediction framework combining salient features extracted from 360-degree video with limited FoV feedback information. A spherical convolutional neural network (CNN) is used instead of a traditional two-dimensional CNN to eliminate the problem of weight sharing failure caused by video projection distortion. Specifically, salient spatial-temporal features are extracted through a spherical convolution-based saliency detection model, after which the limited feedback FoV information is represented as a time-series model based on a spherical convolution-empowered gated recurrent unit network. Finally, the extracted salient video features are combined to predict future user FoVs. The experimental results show that the performance of the proposed method is better than other prediction methods.

“…In [18] Chen et al. proposed Sparkle, a model tailored to predict the exploration patterns of individual users in a 360° video. This model was evaluated against models based on Logistic Regression and the models from [10] and [16], which were found in [14] to be outperformed by baselines not modeling motion at all.…”

Section: Related Work (mentioning)

confidence: 99%

HeMoG: A White-Box Model to Unveil the Connection between Saliency Information and Human Head Motion in Virtual Reality

Rondon, Zanca, Melacci et al. 2021

2021 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)

Immersive environments such as Virtual Reality (VR) are now a main area of interactive digital entertainment. The challenge to design personalized interactive VR systems is specifically to guide and adapt to the user's attention. Understanding the connection between the visual content and the human attentional process is therefore key. In this article, we investigate this connection by first proposing a new head motion predictor named HeMoG. HeMoG is a white-box model built on physics of rotational motion and gravitation. Second, we compare HeMoG with existing reference Deep Learning models. We show that HeMoG can achieve similar or better performance and provides insights on the inner workings of these black-box models. Third, we study HeMoG parameters in terms of video categories and prediction horizons to gain knowledge on the connection between visual saliency and the head motion process.
