Abstract—Saliency-driven image coding is well worth pursuing. Previous studies on JPEG and JPEG2000 have suggested that region-of-interest coding brings little overall benefit compared to the standard implementation. We show that our saliency-driven variable quantization JPEG coding method significantly improves perceived image quality. To validate our findings, we performed large crowdsourcing experiments involving several hundred contributors on 44 representative images. To quantify the level of improvement, we devised an approach to equate Likert-type opinions to bitrate differences. Our saliency-driven coding showed an 11% average bpp benefit over standard JPEG.

I. INTRODUCTION AND MOTIVATION

Region-of-interest (ROI) based image compression techniques compress the background more strongly than the foreground in order to improve perceived image quality. In his studies on two-level ROI-based JPEG2000 image coding, Bradley [1] showed that his strategy did not improve on standard JPEG2000 overall; it only did so at very low bitrates. Harding et al. [2] proposed a binary "visual interest"-guided JPEG2000 compression technique that increased objectively measured image quality, but did not account for the fact that images should be compared at the same bitrate. Furthermore, in a recent study on perceptual quality in images, Alers et al. [3] showed that image foreground regions are much more important than the background. In view of these results, we reconsider ROI-based image compression. To better understand the importance of the ROI in perceived quality, we designed our own saliency-driven variable coding strategy. Due to its simplicity and popularity, we decided to base our variable quantization technique on JPEG Part 3 rather than working with JPEG2000.
One of the intended purposes of variable quantization is "the ability to use the masking properties of the human visual system more effectively, and thereby achieving greater compression rates for the same subjective quality" [4]. Variable quantization has already been shown to produce better results than standard JPEG for special applications. For instance, Konstantinides et al. [5] adjusted the quantization scaling factors in composite documents, and Memon et al. [6] used a measure of block activity and type. None of these works evaluate results in terms of perceptual improvement (user studies). Harding et al. [2] performed limited subjective studies; however, due to the low number of participants, their results are inconclusive. Yu et al. [7] also perform subjective evaluation, using a sequential paired comparison quality assessment methodology, but their results are overall not in favor of their encoding technique.
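To illustrate the general idea, variable quantization can be sketched as scaling an 8×8 block's quantization table by that block's saliency: salient blocks keep fine step sizes, while background blocks are quantized more coarsely. This is a minimal sketch under assumed parameters — the saliency-to-scale mapping and the flat base table below are hypothetical, not the scheme or values used in the method described above.

```python
import numpy as np

def dct2(block):
    """2-D type-II DCT of an 8x8 block (orthonormal), built from the cosine basis."""
    n = 8
    k = np.arange(n)
    basis = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    scale = np.full(n, np.sqrt(2 / n))
    scale[0] = np.sqrt(1 / n)
    D = scale[:, None] * basis
    return D @ block @ D.T

def quantize_block(block, base_table, saliency):
    """Quantize one 8x8 pixel block; low-saliency blocks get a coarser table.

    `saliency` in [0, 1]: 1 keeps the base table, 0 doubles the step sizes.
    The linear saliency-to-scale rule is an illustrative assumption.
    """
    scale = 2.0 - saliency                      # hypothetical mapping
    table = np.maximum(1, np.round(base_table * scale))
    coeffs = dct2(block.astype(float) - 128.0)  # level-shift as in JPEG
    return np.round(coeffs / table).astype(int)
```

Because the coarser table has larger step sizes everywhere, a low-saliency block yields coefficients of equal or smaller magnitude (hence shorter entropy codes) than the same block quantized at full saliency.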
Video streaming under real-time constraints is an increasingly widespread application. Many recent video encoders are unsuitable for this scenario due to theoretical limitations or runtime requirements. In this paper, we present a framework for the perceptual evaluation of foveated video coding schemes. Foveation describes the process of adapting a visual stimulus according to the acuity of the human eye. In contrast to traditional region-of-interest coding, where certain areas are statically encoded at a higher quality, we utilize feedback from an eye-tracker to spatially steer the bit allocation scheme in real time. We evaluate the performance of an H.264-based foveated coding scheme in a lab environment by comparing the bitrates at the point of just noticeable distortion (JND). Furthermore, we identify perceptually optimal codec parameterizations. In our trials, we achieve average bitrate savings of 63.24% at the JND in comparison to the unfoveated baseline.
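A gaze-contingent bit-allocation scheme of this kind can be sketched as a per-pixel quantization-parameter (QP) offset that grows with distance from the tracked fixation point. The Gaussian acuity falloff and the `sigma`/`max_offset` parameters below are illustrative assumptions, not the parameterization used in the cited H.264 experiments.

```python
import numpy as np

def qp_offset_map(height, width, gaze_xy, sigma=200.0, max_offset=10):
    """Per-pixel QP offset for foveated encoding.

    The offset is 0 at the gaze point and rises toward `max_offset` with
    eccentricity, following a Gaussian acuity falloff (an assumed model).
    The encoder would add this offset to its base QP, spending fewer bits
    in the visual periphery.
    """
    gx, gy = gaze_xy
    y, x = np.mgrid[0:height, 0:width]
    dist2 = (x - gx) ** 2 + (y - gy) ** 2
    acuity = np.exp(-dist2 / (2 * sigma ** 2))  # 1 at fixation, -> 0 far away
    return np.round(max_offset * (1 - acuity)).astype(int)
```

In a real-time pipeline, the map would be recomputed (or shifted) whenever the eye-tracker reports a new fixation, then downsampled to the encoder's macroblock grid.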
The just noticeable difference (JND) is the minimal difference between stimuli that a person can detect. The picture-wise just noticeable difference (PJND) for a given reference image and compression algorithm is the minimal level of compression that causes noticeable differences in the reconstruction. These differences can only be observed in specific regions within the image, dubbed JND-critical regions. Identifying these regions can aid the development of image compression algorithms. Because visual perception varies among individuals, determining the PJND values and JND-critical regions for a target population of consumers requires subjective assessment experiments involving a sufficiently large number of observers. In this paper, we propose a novel framework for conducting such experiments using crowdsourcing. By applying this framework, we created a novel PJND dataset, KonJND++, consisting of 300 source images, compressed versions thereof under JPEG or BPG compression, and, for each source image, an average of 43 PJND ratings and 129 self-reported locations of JND-critical regions. Our experiments demonstrate the effectiveness and reliability of the proposed framework, which is easily adapted for collecting large-scale datasets. The source code and dataset are available at https://github.com/angchen-dev/LocJND.
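Assuming noticeability is monotone in the compression level, the PJND boundary for one observer can be located with a binary search over quality levels. This is a sketch only: the `is_noticeable` oracle below is a hypothetical stand-in for an observer's (or the crowd's aggregated) judgment, not part of the framework described above.

```python
def find_pjnd(is_noticeable, q_min=1, q_max=100):
    """Binary-search the PJND point on a quality scale.

    `is_noticeable(q)` is an assumed monotone oracle: True when the
    reconstruction at quality q is visibly distorted (lower q = stronger
    compression). Returns the highest quality level that is still
    noticeably distorted, or None if no level on the scale is.
    """
    if not is_noticeable(q_min):
        return None                      # even the strongest compression passes
    lo, hi = q_min, q_max
    while lo < hi:                       # invariant: is_noticeable(lo) is True
        mid = (lo + hi + 1) // 2
        if is_noticeable(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo
```

Compared with scanning all quality levels, the search needs only about log2(100) ≈ 7 judgments per image, which is what makes per-observer PJND estimation feasible in a crowdsourcing setting.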