This paper compares the energy efficiency of chip multiprocessing (CMP) and simultaneous multithreading (SMT) on modern out-of-order processors for the increasingly important multimedia applications. Since performance is an important metric for real-time multimedia applications, we compare configurations at equal performance. We perform this comparison for a large number of performance points derived using different processor architectures and frequencies/voltages. We find that for the design space explored, for each workload, at each performance point, CMP is more energy efficient than SMT. The difference is small for two-thread systems, but large (18% to 44%) for four-thread systems. We also find that the best SMT and the best CMP configurations for a given performance target have different architectures and frequencies/voltages. Therefore, their relative energy efficiency depends on a subtle interplay between various factors such as capacitance, voltage, IPC, frequency, and the level of clock gating, as well as workload features. We perform a detailed analysis considering these factors and develop a mathematical model to explain these results. Although CMP shows a clear energy advantage for four-thread (and higher) workloads, it comes at the cost of increased silicon area. We therefore investigate a hybrid solution where a CMP is built out of SMT cores, and find it to be an effective compromise. Finally, we find that we can reduce energy further for CMP with a straightforward application of previously proposed techniques of adaptive architectures and dynamic voltage/frequency scaling.
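The interplay of capacitance, voltage, IPC, and frequency named in the abstract above can be illustrated with the textbook first-order relation for dynamic CMOS power, P = a·C·V²·f. This is a minimal sketch and not the paper's own mathematical model: the function names and every numeric value below are hypothetical, and static power and clock gating are ignored. Because throughput is IPC × f, frequency cancels out of the energy per instruction, leaving a·C·V²/IPC.

```python
def performance(ipc, frequency_hz):
    """Throughput in instructions per second."""
    return ipc * frequency_hz


def dynamic_energy_per_instruction(activity, capacitance_f, voltage_v, ipc):
    """First-order dynamic energy per instruction, in joules.

    Power is a * C * V^2 * f and throughput is IPC * f, so the
    frequency term cancels: E = a * C * V^2 / IPC.
    """
    if ipc <= 0:
        raise ValueError("ipc must be positive")
    return activity * capacitance_f * voltage_v ** 2 / ipc


# Two hypothetical configurations tuned to the same throughput.
# All numbers are illustrative assumptions, not measurements.
config_a = {"ipc": 4.0, "frequency_hz": 1.0e9, "voltage_v": 1.0, "capacitance_f": 2.0e-9}
config_b = {"ipc": 2.0, "frequency_hz": 2.0e9, "voltage_v": 1.2, "capacitance_f": 1.2e-9}

for name, cfg in (("A", config_a), ("B", config_b)):
    perf = performance(cfg["ipc"], cfg["frequency_hz"])
    energy = dynamic_energy_per_instruction(
        0.5, cfg["capacitance_f"], cfg["voltage_v"], cfg["ipc"]
    )
    print(f"config {name}: {perf:.2e} instr/s, {energy:.3e} J/instr")
```

Under these assumptions, the configuration that reaches the performance target with a higher frequency and voltage pays a quadratic voltage penalty, which lower switched capacitance may or may not offset.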
The authors present a new technique for estimating the visibility distance in fog conditions. Based on a psychovisual model and on contrast estimation with a wavelet transform, their technique fared well when compared to a direct approach based on local contrast calculation.
We present an instruction-level power dissipation model of the Intel XScale® microprocessor. The XScale implements the ARM™ ISA, but uses an aggressive microarchitecture and a SIMD Wireless MMX™ co-processor to speed up execution of multimedia workloads in the embedded domain. Instruction-level power modelling was first proposed by Tiwari et al. in 1994. Adaptations of this model have been found to be applicable to simple ARM processors. Research also shows that instructions can be clustered into groups with similar energy characteristics. We adapt these methodologies to the significantly more complex XScale processor. We characterize the processor in terms of the energy costs of opcode execution, operand values, pipeline stalls, etc. through accurate measurements on hardware. This instruction-based (rather than microarchitectural) approach allows us to build a high-speed power-accurate simulator that runs at MIPS-range speeds, while achieving accuracy better than 5%. The processor core accounts only for a portion of overall power consumption, and we move beyond the core to explore the issues involved in building a SystemC simulation framework that models power dissipation of complete systems quickly, flexibly and accurately.
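The general shape of an instruction-level energy model of the kind described above can be sketched as a sum of per-opcode base costs, an overhead for each pair of adjacent instructions, and a cost per stall cycle. This is a simplified illustration and not the authors' simulator: the function name, the trace format, and all cost values are hypothetical, and operand-value effects are left out.

```python
def estimate_energy(trace, base_cost, pair_overhead, stall_cost, default_overhead=0.0):
    """Estimate the energy of an instruction trace.

    trace         -- list of (opcode, stall_cycles) tuples, in execution order
    base_cost     -- dict mapping opcode to its base energy cost
    pair_overhead -- dict mapping (previous_opcode, opcode) to the extra
                     cost of switching between the two instructions
    stall_cost    -- energy charged per pipeline stall cycle
    """
    total = 0.0
    previous = None
    for opcode, stall_cycles in trace:
        total += base_cost[opcode]
        if previous is not None:
            total += pair_overhead.get((previous, opcode), default_overhead)
        total += stall_cycles * stall_cost
        previous = opcode
    return total


# Hypothetical costs in arbitrary energy units, for illustration only.
base_cost = {"add": 1.0, "ldr": 2.0}
pair_overhead = {("add", "ldr"): 0.5, ("ldr", "add"): 0.25}
trace = [("add", 0), ("ldr", 2), ("add", 0)]

print(estimate_energy(trace, base_cost, pair_overhead, stall_cost=0.75))
```

Because the model only needs a table lookup per executed instruction, a simulator built this way can plausibly run much faster than one that models every microarchitectural structure, which is the trade-off the abstract points to.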
In this paper, we address the problem of motion estimation (ME) in digital video sequences and propose a new fast, adaptive, and efficient block-matching algorithm. Higher quality and efficiency are achieved using a statistical model for the motion vectors. This model introduces adaptation in the search window, drastically reducing the number of positions where correlation-type computation is performed. The efficiency is further improved by progressively undersampling the macroblock. Patterns for undersampling are proposed to obtain the maximum benefit from single instruction multiple data (SIMD) instructions. In contrast with existing motion-estimation techniques, search strategy and subsampled patterns are closely linked. This shows that a good search strategy is much more important than blindly reducing the number of pixels considered for the matching pattern. We describe an implementation of the proposed matching strategy that exploits the very long instruction word (VLIW) and SIMD technology available in the new Itanium Processor Family. Results show that the proposed algorithm adapts easily to the evolution of the scene, avoiding the annoying quality drops that can be observed with other deterministic algorithms. The total number of operations required by the proposed method is lower than that required by traditional approaches. Index Terms: Adaptive motion estimation, block matching, digital video, MPEG, SIMD, VLIW.
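The two cost levers mentioned in the abstract above, the number of search positions and the number of pixels compared per position, can be seen in a plain block-matching sketch. This is not the proposed adaptive algorithm: it uses a fixed square search window and a regular subsampling grid instead of the statistical motion-vector model and SIMD-oriented patterns, and all names and defaults are hypothetical.

```python
def subsampled_sad(cur, ref, bx, by, dx, dy, block, step):
    """Sum of absolute differences over every `step`-th pixel of a block."""
    total = 0
    for y in range(0, block, step):
        for x in range(0, block, step):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(cur, ref, bx, by, block=8, radius=2, step=2):
    """Exhaustively search a (2*radius+1)^2 window around block (bx, by).

    Returns ((dx, dy), cost) for the candidate with the lowest subsampled
    SAD. Candidates that would read outside the reference frame are skipped.
    """
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            inside = (0 <= by + dy and by + dy + block <= height
                      and 0 <= bx + dx and bx + dx + block <= width)
            if not inside:
                continue
            cost = subsampled_sad(cur, ref, bx, by, dx, dy, block, step)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


# Synthetic 16x16 frames: the current frame is the reference frame
# displaced by 2 pixels horizontally and 1 pixel vertically.
size = 16
ref = [[x * x + 3 * y * y + x * y for x in range(size)] for y in range(size)]
cur = [[ref[min(y + 1, size - 1)][min(x + 2, size - 1)] for x in range(size)]
       for y in range(size)]

print(best_motion_vector(cur, ref, bx=4, by=4))
```

With `radius=2` and `step=2`, each block costs 25 candidate positions of 16 pixel comparisons each. Shrinking the window adaptively reduces the first factor and coarser sampling reduces the second, which is the trade-off the abstract argues should be made jointly.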