The total time to execute an application, the energy consumed, and the flexibility to manage a large set of applications are among the most important parameters used to measure the quality of a computing system. Superior architectures with flexible reconfigurable arrays lead to innovation beyond the limits of traditional silicon. The incorporation of on-chip reconfigurable computing elements generally improves execution time. However, the energy consumed to deliver the required level of performance is an important consideration for prolonging battery life in portable and mobile devices. In this paper, we propose and design a novel scalable array architecture and explore the performance and energy trade-offs for various applications by scaling system parameters such as hardware resources, operational granularity, and supply voltage. The scalable coprocessor for mapping the Discrete Cosine Transform (DCT) is implemented with 8 taps, resulting in an area of 0.0024 mm² at 0.1 µm technology. The coprocessor that runs a 16-tap convolution function occupies an area of 0.0099 mm², while a 256-tap convolution function is designed at an area cost of 0.1585 mm². When the MPEG decode application is executed on the proposed architecture, with the DCT function computed in the scalable coprocessor, the total execution time is reduced to around 24%, and the energy consumed to around 28%, of the corresponding values for the base architecture without a coprocessor. Further, as the coprocessor's supply voltage is scaled down from 1.8 to 1.0 volts at 0.1 µm technology, the relative total execution time varies only slightly (from 23.65% to 24.78%), while the relative energy consumed drops considerably (from 28.12% to 23.8%).
For the FIR application, energy consumption is reduced by up to 36% when hardware resources are scaled and by up to a further 12% when voltage is scaled, while execution time is reduced by up to 50% when hardware resources are scaled and increases by up to 15% when voltage is scaled. The study also reveals interesting performance patterns for applications such as CJPEG, MPEG decode/encode, FIR, and IIR, depending on their characteristics.
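
As a rough illustration of why supply-voltage scaling trades a small slowdown for an energy gain, the sketch below applies the textbook first-order CMOS model, in which switching energy per operation scales with the square of the supply voltage. This model is an assumption introduced here for illustration and is not necessarily the energy model used in this work; the function names `dynamic_energy_ratio` and `savings_percent` are hypothetical helpers. Only the voltages and percentages are taken from the figures quoted above.

```python
def dynamic_energy_ratio(v_scaled: float, v_nominal: float) -> float:
    """Switching-energy ratio under the assumed first-order model E ~ C * V^2.

    Capacitance and activity are taken as unchanged, so they cancel out.
    """
    return (v_scaled / v_nominal) ** 2


def savings_percent(relative_percent: float) -> float:
    """Convert 'X% of the base architecture' into a percentage saved."""
    return 100.0 - relative_percent


if __name__ == "__main__":
    # Coprocessor supply scaled from 1.8 V to 1.0 V (values from the abstract).
    ratio = dynamic_energy_ratio(1.0, 1.8)
    print(f"Assumed coprocessor switching-energy ratio: {ratio:.3f}")

    # Relative MPEG decode energy reported at the two supply voltages.
    for volts, rel_energy in ((1.8, 28.12), (1.0, 23.8)):
        saved = savings_percent(rel_energy)
        print(f"{volts} V: {rel_energy}% of base energy, {saved:.2f}% saved")
```

Under this assumed model the coprocessor's own switching energy would fall to roughly 31% of its 1.8 V value, whereas the reported system-level figure moves only from 28.12% to 23.8% of the base. That gap is what one would expect if only the coprocessor supply is scaled and the rest of the system continues to consume energy at its original level, although the abstract itself does not give this breakdown.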