Peak Signal-to-Noise Ratio (PSNR) is widely used as a video quality metric or performance indicator. Some studies have indicated that it correlates poorly with subjective quality, whilst others have used it on the basis that it provides a good correlation with subjective data. Existing literature thus provides conflicting evidence on the accuracy of PSNR as a video quality metric. Based on experimental results, we explain a scenario where PSNR provides a reliable indication of the variation of subjective video quality and scenarios where PSNR is not a reliable video quality metric. We show that PSNR follows a monotonic relationship with subjective quality in the case of full frame rate encoding when the video content and codec are fixed. We provide evidence that PSNR becomes an unreliable and inaccurate quality metric when several videos with different content are jointly assessed. Furthermore, PSNR is inaccurate in measuring the quality of the same video content encoded at different frame rates because it is not capable of assessing the perceptual trade-off between the spatial and temporal qualities. Finally, where PSNR is not a reliable video quality metric across different video contents and frame rates, we show that a perceptual video model recently approved by the International Telecommunication Union (ITU) provides quality predictions that correlate highly with subjective scores even if different video scenes coded at different frame rates are considered in the test set.
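For readers unfamiliar with the metric, the following is a minimal sketch of how PSNR is conventionally computed for a single frame, assuming 8-bit samples (peak value 255) stored as flat sequences. The function name `psnr` and its parameters are illustrative and are not taken from the paper.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """Return the PSNR in dB between two equally sized sample sequences.

    `peak` is the maximum possible sample value (255 for 8-bit video);
    this default is an assumption, not something stated in the abstract.
    """
    if len(reference) != len(distorted) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean squared error between corresponding samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)

    if mse == 0:
        # Identical frames: PSNR is unbounded.
        return math.inf

    return 10.0 * math.log10(peak ** 2 / mse)
```

Because the computation depends only on per-sample differences, it carries no notion of content or frame rate, which is consistent with the limitations the abstract describes.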
Traditionally, audio quality and video quality are evaluated separately in subjective tests. Best practices within the quality assessment community were developed before many modern mobile audiovisual devices and services came into use, such as internet video, smart phones, tablets and connected televisions. These devices and services raise unique questions that require jointly evaluating both the audio and the video within a subjective test. However, audiovisual subjective testing is a relatively under-explored field. In this paper, we address the question of determining the most suitable way to conduct audiovisual subjective testing on a wide range of audiovisual quality. Six laboratories from four countries conducted a systematic study of audiovisual subjective testing. The stimuli and scale were held constant across experiments and labs; only the environment of the subjective test was varied. Some subjective tests were conducted in controlled environments and some in public environments (a cafeteria, patio or hallway). The audiovisual stimuli spanned a wide range of quality. Results show that these audiovisual subjective tests were highly repeatable from one laboratory and environment to the next. The number of subjects was the most important factor. Based on this experiment, 24 or more subjects are recommended for Absolute Category Rating (ACR) tests. In public environments, 35 subjects were required to obtain the same Student's t-test sensitivity. The second most important variable was individual differences between subjects. Other environmental factors had minimal impact, such as language, country, lighting, background noise, wall color, and monitor calibration. Analyses indicate that Mean Opinion Scores (MOS) are relative rather than absolute. Our analyses show that the results of experiments done in pristine, laboratory environments are highly representative of those devices in actual use, in a typical user environment.
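As a rough illustration of the quantities mentioned above, the sketch below computes a Mean Opinion Score from ACR ratings and a two-sample Student's t statistic (pooled variance) for comparing two stimuli. The function names and the equal-variance assumption are illustrative; the abstract does not specify the exact test procedure the authors used.

```python
import math
import statistics


def mos(ratings):
    """Mean Opinion Score: the arithmetic mean of the subjects' ratings."""
    return statistics.mean(ratings)


def student_t(ratings_a, ratings_b):
    """Two-sample Student's t statistic, assuming equal variances.

    Each argument holds one rating per subject for one stimulus.
    """
    n_a, n_b = len(ratings_a), len(ratings_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two ratings")

    # Pooled sample variance across both groups.
    pooled_var = (
        (n_a - 1) * statistics.variance(ratings_a)
        + (n_b - 1) * statistics.variance(ratings_b)
    ) / (n_a + n_b - 2)

    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    if std_err == 0:
        raise ValueError("ratings have zero variance; t is undefined")

    return (mos(ratings_a) - mos(ratings_b)) / std_err
```

With more subjects the standard error shrinks, so smaller MOS differences become detectable, which is one way to read the abstract's finding that the number of subjects was the most important factor.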
In this paper we describe a subjective quality assessment experiment conducted to measure the impact of temporal artifacts on video quality and characterize the influence of content motion on perceived quality. We examined the human response to jerkiness and jitter by considering different levels of strength, duration and distribution of the temporal impairments. Using videos with high picture quality, we found that for intermediate and high frame rate values video quality was similar regardless of the duration of the frame rate decimation. On the other hand, for very low frame rates, overall video quality decreased as the duration of the impairment increased. The results also show that a reduction of the temporal resolution over the entire video does not necessarily lead to a significant loss of quality. Finally, the results of this study do not confirm the traditional assumption that lower-motion content receives higher quality ratings than high-motion content for a given frame rate decimation factor. Using several motion descriptors, we observed that for a given sub-optimum frame rate, perceived quality does not necessarily increase with decreasing motion magnitude. In particular, we found that the perceived quality of head-and-shoulders content is severely affected by frame rate decimation although it is characterized by very low motion. Our results suggest that motion magnitude is not the only factor affecting perception of temporal artifacts.