We discuss new theoretical and experimental results on the dynamic rate shaping (DRS) approach for transcoding compressed video bitstreams (MPEG-1, MPEG-2, MPEG-4, H.261, as well as JPEG). We analyze the behavior of DRS assuming a first order autoregressive source. We propose a set of low complexity algorithms for both constrained and unconstrained DRS and substantiate the almost-optimal experimental performance of the memoryless algorithm under the same source model. By deriving the statistical and rate-distortion characteristics of different components of the interframe rate shaping problem, we offer an explanation as to why the set of optimal breakpoint values for any frame is somewhat invariant to the accumulated motion compensated shaping error from past frames. We also present an extensive experimental study of the various DRS algorithms (causally optimal, memoryless, and rate-based) in both their constrained and generalized forms. The study proves the computational viability of the DRS approach to transcoding and identifies a range of rate shaping ratios for which it is better than requantization, in both complexity and performance. This result is significant in that it opens the way to constructing much simpler memoryless algorithms that incur minimal penalty in achieved quality, not just for this but possibly for other types of algorithms. This is also the very first use of matrix perturbation theory for tracking the spectral behavior of the autocorrelation matrix of the source signal and the motion residual it yields.
MPEG-4 is the first visual coding standard that allows coding of scenes as a collection of individual audio-visual objects. We present mathematical formulations for modeling object-based scalability and some functionalities that it brings with it. Our goal is to study algorithms that aid in semi-automating the authoring and subsequent selective addition/dropping of objects from a scene to provide content scalability. We start with a simplistic model for object-based scalability using the "knapsack problem"--a problem for which the optimal object set can be found using known schemes such as dynamic programming, the branch and bound method and approximation algorithms. The above formulation is then generalized to model authoring or multiplexing of scalable objects (e.g., objects encoded at various target bit-rates) using the "multiple choice knapsack problem." We relate this model to several problems that arise in video coding, the most prominent of these being the bit allocation problem. Unlike previous approaches that solve the operational bit allocation problem using Lagrangean relaxation, we discuss an algorithm that solves the linear programming (LP) relaxation of this problem. We show that for this problem the duality gap for Lagrange and LP relaxations is exactly the same. The LP relaxation is solved using strong duality with dual descent--a procedure that can be completed in "linear" time. We show that there can be at most two fractional variables in the optimal primal solution and therefore this relaxation can be justified for many practical applications. This work reduces problem complexity, guarantees similar performance, is slightly more generic, and provides an alternate LP-duality based proof for earlier work by Shoham and Gersho (1988). In addition, we show how additional constraints may be added to impose inter-dependencies among objects in a presentation and discuss how object aggregation can be exploited in reducing problem complexity.
The marginal analysis approach of Fox (1966) is suggested as a method of re-allocation with incremental inputs. It helps in efficiently re-optimizing the allocation when a system has user interactivity, appearing or disappearing objects, time driven events, etc. Finally, we suggest that approximation algorithms for the multiple choice knapsack problem can be used to quantify the complexity vs. quality tradeoff at the encoder in a tunable and universal way.
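To make the multiple choice knapsack formulation concrete, the following is a minimal sketch of an exact dynamic-programming solver, not the LP-relaxation/dual-descent algorithm described in the abstract. It assumes integer bit-rates and an additive quality score; the function name `solve_mckp`, the `(rate, quality)` option tuples, and the sample numbers are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

Option = Tuple[int, float]  # (rate in integer units, quality score); illustrative only


def solve_mckp(objects: List[List[Option]],
               budget: int) -> Optional[Tuple[float, List[int]]]:
    """Pick exactly one option per object to maximize total quality
    subject to total rate <= budget. Returns (quality, chosen indices),
    or None if no feasible selection exists."""
    neg_inf = float("-inf")
    # best[b]: best quality for the objects processed so far with rate <= b.
    best = [0.0] * (budget + 1)
    choices: List[List[int]] = []  # choices[k][b]: option picked for object k

    for options in objects:
        new_best = [neg_inf] * (budget + 1)
        pick = [-1] * (budget + 1)
        for b in range(budget + 1):
            for i, (rate, quality) in enumerate(options):
                if rate <= b and best[b - rate] > neg_inf:
                    value = best[b - rate] + quality
                    if value > new_best[b]:
                        new_best[b] = value
                        pick[b] = i
        best = new_best
        choices.append(pick)

    if best[budget] == neg_inf:
        return None

    # Backtrack from the full budget to recover one option per object.
    selection = [0] * len(objects)
    b = budget
    for k in range(len(objects) - 1, -1, -1):
        i = choices[k][b]
        selection[k] = i
        b -= objects[k][i][0]
    return best[budget], selection


# Hypothetical scene: a rate-0 option models dropping the object.
scene = [
    [(0, 0.0), (3, 4.0), (5, 6.0)],
    [(0, 0.0), (2, 3.0), (4, 3.5)],
    [(1, 1.0), (4, 5.0)],
]
result = solve_mckp(scene, budget=8)
print(result)
```

The run time of this sketch grows with the budget times the total number of options, which is one reason relaxation-based methods such as the one discussed above are attractive for larger problems.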
We discuss the behavior of the optimal solution to the dynamic rate shaping problem assuming an AR(1) source model. By analyzing the statistical and rate-distortion behavior of the different components of this minimization problem, the following key result is mathematically proven: "The set of optimal breakpoint values for any frame is invariant to the accumulated motion compensated shaping error from past frames and may be very reasonably approximated using the current frame shaping error alone".
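For readers unfamiliar with the source model, here is a small sketch of the standard second-order statistics of a stationary AR(1) process x[n] = rho * x[n-1] + w[n]. It only illustrates the model assumed in the analysis, not the analysis itself; the helper names and the value rho = 0.95 are illustrative assumptions, not figures taken from the paper.

```python
from typing import List


def ar1_autocorrelation(n: int, rho: float, variance: float = 1.0) -> List[List[float]]:
    """n x n autocorrelation matrix of a zero-mean stationary AR(1) process:
    R[i][j] = variance * rho ** |i - j| (symmetric Toeplitz)."""
    return [[variance * rho ** abs(i - j) for j in range(n)] for i in range(n)]


def ar1_innovation_variance(rho: float, variance: float = 1.0) -> float:
    """Variance of the driving noise w[n] needed for the process to have
    the given stationary variance: variance * (1 - rho^2)."""
    return variance * (1.0 - rho * rho)


# Illustrative correlation coefficient (an assumption, not from the paper).
rho = 0.95
R = ar1_autocorrelation(4, rho)
for row in R:
    print(["%.4f" % value for value in row])
print("innovation variance:", ar1_innovation_variance(rho))
```

A correlation coefficient close to 1 means neighboring samples are highly correlated, so the driving-noise variance is small relative to the signal variance.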