We propose software designs that perform incremental computation with monotonic distortion reduction for two-dimensional convolution and frame-by-frame block-matching tasks. To reduce the run time of the proposed designs, we combine bitplane-based computation with a recently proposed packing technique. In the case of block matching, we also use previously computed motion vectors to perform a localized search when incrementing the precision of the input video frames. The applicability of the proposed approach is demonstrated by execution time measurements on the xo-laptop ("100$ laptop") and on a mainstream laptop; our software is also made available online. In comparison to the conventional (non-incremental) software realization, the proposed approach leads to scalable computation per input frame while producing identical (or comparable) precision for the output results of each operating point. In addition, the execution of the proposed designs can be terminated at an arbitrary point for each frame, with the output being available at the already-computed precision.

Index Terms: complexity-scalable image processing, incremental refinement of computation, programmable processors
INTRODUCTION

Several popular applications, such as media players, image and video post-processing, and motion estimation and compensation, are implemented today as software solutions on general-purpose processors. New generations of processors are increasingly powerful and, owing to multicore designs, allow more resources to be dedicated to such real-time multimedia tasks [1]. At the same time, new generations of software compilers now automatically generate platform-specific optimized assembly code from C++ code [2], thereby enabling platform-independent C++ software solutions to achieve high processor utilization factors. Existing algorithm-oriented research focuses on complexity reduction [3]-[5] or complexity scalability for image processing tasks [6]-[8], where computational complexity is decreased and approximate results are produced. Implementation-oriented research focuses on multimedia-driven energy scaling of processors via dynamic voltage scaling [9], [10] in an attempt to provide computational scalability with approximate results. However, previous approaches can only obtain one operational point on the complexity-distortion curve [3], [7], without being able to increment the quality of the output with increased computation. In addition, in practical image and video coding systems, complexity does not scale down significantly with decreased source precision (decreased bitrate) [7]. Finally, existing image processing realizations produce an "all or nothing" output: one cannot interrupt the computation when system resources suddenly become unavailable (or when delay constraints are about to be violated) and retrieve a meaningful approximation of the final result¹. An exception is found in proposals for incremental computation of transforms and salient point detection algorithms [11], [12], where the main principle is:...