No-reference video quality assessment for user generated content based on deep network and visual perception

Tan, Yaya; Kong, Guangqian; Duan, Xun; Long, Huiyun

doi:10.1117/1.jei.30.5.053026

Cited by 2 publications

(2 citation statements)

References 0 publications


“…Blue and black numbers in bold represent the best and second best respectively. We take numbers from (Ying et al, 2021;Jiang et al, 2021;You, 2021;Tan et al, 2021;Liao et al, 2022) for the results of the reference methods. Our final method is marked in gray.…”

Section: Results on LSVQ and LSVQ-1080p (mentioning)

confidence: 99%

“…It shows that exploiting both global and local information can be beneficial for VQA. Recent CNN-Transformer hybrid methods (Jiang et al, 2021;Li et al, 2021;Tan et al, 2021;You, 2021) show the benefit of using Transformer for temporal aggregation on CNN-based frame-level features. Since all these methods use CNN for spatial feature extraction, they suffer from CNN's limitation, i.e., a relatively small spatial receptive field.…”

Section: Related Work (mentioning)

confidence: 99%


MRET: Multi-resolution transformer for video quality assessment

Zhang, Wang, Milanfar, et al. 2023. Front. Signal Process.


No-reference video quality assessment (NR-VQA) for user generated content (UGC) is crucial for understanding and improving visual experience. Unlike video recognition tasks, VQA tasks are sensitive to changes in input resolution. Since large amounts of UGC videos nowadays are 720p or above, the fixed and relatively small input used in conventional NR-VQA methods results in missing high-frequency details for many videos. In this paper, we propose a novel Transformer-based NR-VQA framework that preserves the high-resolution quality information. With the multi-resolution input representation and a novel multi-resolution patch sampling mechanism, our method enables a comprehensive view of both the global video composition and local high-resolution details. The proposed approach can effectively aggregate quality information across different granularities in spatial and temporal dimensions, making the model robust to input resolution variations. Our method achieves state-of-the-art performance on large-scale UGC VQA datasets LSVQ and LSVQ-1080p, and on KoNViD-1k and LIVE-VQC without fine-tuning.


No-reference video quality assessment based on human visual perception

Zhou, Kong, Duan, et al. 2024. J. Electron. Imag.


Conducting video quality assessment (VQA) for user-generated content (UGC) videos and achieving consistency with subjective quality assessment are highly challenging tasks. We propose a no-reference video quality assessment (NR-VQA) method for UGC scenarios by considering characteristics of human visual perception. To distinguish between varying levels of human attention within different regions of a single frame, we devise a dual-branch network. This network extracts spatial features containing positional information of moving objects from frame-level images. In addition, we employ the temporal pyramid pooling module to effectively integrate temporal features of different scales, enabling the extraction of inter-frame temporal information. To mitigate the time-lag effect in the human visual system, we introduce the temporal pyramid attention module. This module evaluates the significance of individual video frames and simulates the varying attention levels exhibited by humans towards frames. We conducted experiments on the KoNViD-1k, LIVE-VQC, CVD2014, and YouTube-UGC databases. The experimental results demonstrate the superior performance of our proposed method compared to recent NR-VQA techniques in terms of both objective assessment and consistency with subjective assessment.
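The abstract above mentions a temporal pyramid pooling module that integrates temporal features at different scales. The following is only a minimal sketch of the general idea of temporal pyramid pooling, not the authors' module: the function name `temporal_pyramid_pool`, the pyramid levels `(1, 2, 4)`, and the choice of mean pooling are all assumptions made for illustration. Per-frame feature vectors are split into 1, 2, and 4 temporal segments, each segment is averaged, and the results are concatenated into one fixed-length descriptor regardless of video length.

```python
from typing import List, Sequence


def temporal_pyramid_pool(
    frames: Sequence[Sequence[float]],
    levels: Sequence[int] = (1, 2, 4),  # assumed pyramid levels, not from the paper
) -> List[float]:
    """Mean-pool per-frame features over temporal segments at several scales.

    frames: one feature vector per video frame (all of equal length).
    Returns a flat vector of length sum(levels) * feature_dim.
    """
    if not frames:
        raise ValueError("frames must contain at least one feature vector")

    num_frames = len(frames)
    dim = len(frames[0])
    pooled: List[float] = []

    for bins in levels:
        for b in range(bins):
            # Segment boundaries; each segment holds at least one frame,
            # so short videos reuse frames instead of producing empty bins.
            start = (b * num_frames) // bins
            end = max(((b + 1) * num_frames) // bins, start + 1)
            segment = frames[start:end]
            # Average each feature dimension over the frames in the segment.
            for d in range(dim):
                pooled.append(sum(f[d] for f in segment) / len(segment))

    return pooled


if __name__ == "__main__":
    # Four frames with 1-D features, pooled at scales 1 and 2.
    features = [[1.0], [2.0], [3.0], [4.0]]
    print(temporal_pyramid_pool(features, levels=(1, 2)))
```

The output length depends only on the levels and the feature dimension, which is what lets a downstream regressor accept videos with different frame counts.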
