ATSal: An Attention Based Architecture for Saliency Prediction in 360° Videos

Dahou, Yasser; Tliba, Marouane; McGuinness, Kevin; O’Connor, Noel E.

doi:10.1007/978-3-030-68796-0_22

Cited by 24 publications

(7 citation statements)

References 40 publications

“…In [18,33,34], the authors aimed at improving the saliency detection accuracy near the poles of the 360 sphere by reducing the impact of distortion at the top and bottom of the equirectangular format. Others [35,36] converted the input 360 video into a cubic map to reduce the negative impact of distortion at the poles of the equirectangular and presented an attention architecture to increase the saliency detection accuracy.…”

Section: Saliency Detection (mentioning)

confidence: 99%

Enhancing 360 Video Streaming through Salient Content in Head-Mounted Displays

Nguyen

Yan

2023

Sensors

Predicting where users will look inside head-mounted displays (HMDs) and fetching only the relevant content is an effective approach for streaming bulky 360 videos over bandwidth-constrained networks. Despite previous efforts, anticipating users’ fast and sudden head movements is still difficult because there is a lack of clear understanding of the unique visual attention in 360 videos that dictates the users’ head movement in HMDs. This in turn reduces the effectiveness of streaming systems and degrades the users’ Quality of Experience. To address this issue, we propose to extract salient cues unique in the 360 video content to capture the attentive behavior of HMD users. Empowered by the newly discovered saliency features, we devise a head-movement prediction algorithm to accurately predict users’ head orientations in the near future. A 360 video streaming framework that takes full advantage of the head movement predictor is proposed to enhance the quality of delivered 360 videos. Practical trace-driven results show that the proposed saliency-based 360 video streaming system reduces the stall duration by 65% and the stall count by 46%, while saving 31% more bandwidth than state-of-the-art approaches.

“…However, as VR environments are often dynamic, these models may not be sufficient for certain applications. To address this, some recent works have focused on attention prediction in 360° videos [3,9,12]. Nevertheless, all these models only take visual stimuli as input, and therefore they do not take into account the potential influence of sound in VR environments [30].…”

Section: Analyzing and Predicting Viewing Behavior in VR (mentioning)

confidence: 99%

D-SAV360: A Dataset of Gaze Scanpaths on 360° Ambisonic Videos

Bernal-Berdun

Martin

Malpica

et al.

2023

IEEE Trans. Visual. Comput. Graphics

Fig. 1: We present D-SAV360, the most extensive dataset of viewing behavior on 360° ambisonic videos to date. We have collected gaze and head data from 87 different participants viewing 85 dynamic 360° videos with directional ambisonic sound, leading to a total of 4,609 scanpaths, larger than previously available datasets of comparable scope. We have thoroughly analyzed this gathered data, and provide valuable insights about viewing behavior and the importance of factors such as viewing conditions, gender, or the type of content shown. We additionally discuss potential applications for our dataset, including benchmarking of audiovisual saliency models, scanpath prediction, or stitching quality assessment, among others. Our dataset is available at https://graphics.unizar.es/projects/D-SAV360.

“…In order to achieve better visual fidelity, the generative adversarial network is the most popular model, and has been successfully used in many works [19], [24], [20], [29], [30], [31]. Therefore, our method should be compared with three specific, key architectures: conditional GAN [23], HoloGAN [35], and ATSal [9].…”

Section: D-aware View Synthesis (mentioning)

confidence: 99%

See360: Novel Panoramic View Interpolation

Liu

Cani

Siu

2022

IEEE Trans. on Image Process.

We present See360, which is a versatile and efficient framework for 360° panoramic view interpolation using latent space viewpoint estimation. Most of the existing view rendering approaches only focus on indoor or synthetic 3D environments and render new views of small objects. In contrast, we propose tackling camera-centered view synthesis as a 2D affine transformation without using point clouds or depth maps, which enables an effective 360° panoramic scene exploration. Given a pair of reference images, the See360 model learns to render novel views by a proposed novel Multi-Scale Affine Transformer (MSAT), enabling the coarse-to-fine feature rendering. We also propose a Conditional Latent space AutoEncoder (C-LAE) to achieve view interpolation at any arbitrary angle. To show the versatility of our method, we introduce four training datasets, namely UrbanCity360, Archinterior360, HungHom360 and Lab360, which are collected from indoor and outdoor environments for both real and synthetic rendering. Experimental results show that the proposed method is generic enough to achieve real-time rendering of arbitrary views for all four datasets. In addition, our See360 model can be applied to view synthesis in the wild: with only a short extra training time (approximately 10 mins), it is able to render unknown real-world scenes. The superior performance of See360 opens up a promising direction for camera-centered view rendering and 360° panoramic view interpolation.
