The ever-increasing number of processing units integrated on the same many-core chip delivers computational power that can exceed the performance requirements of a single application. The number of chips (and the related power consumption) can thus be reduced by serving multiple applications on the same chip, a practice called resource consolidation. However, this solution requires techniques to partition and assign resources among the applications and to manage unpredictable dynamic workloads. To meet the performance requirements in such scenarios, we exploit application auto-tuning, based on design-time analysis, of both application-specific dynamic knobs and computational parallelism. These features are implemented in a software library, which is used to demonstrate the main contribution of this paper: a lightweight Run-Time Resource Management (RTRM) technique to improve resource sharing for computationally intensive OpenCL applications. We evaluate to what extent the interaction between RTRM and application auto-tuning can be synergistic yet orthogonal. In the proposed approach, run-time adaptation decisions are taken autonomously by each application. This has two main advantages: i) a non-invasive application design, in terms of source code, and ii) a very low run-time overhead, since it requires neither central coordination by a supervisor nor communication between the applications. We carried out an experimental campaign using a video processing application (an OpenCL stereo-matching implementation) while stressing resource usage. We showed that, while RTRM is necessary to lower the variance of the application performance, the application auto-tuning layer is fundamental to trade that performance off against computation accuracy.
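The autonomous, per-application adaptation described above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the knob names, the throughput and error figures, and the `select_point` and `observed_slowdown` helpers are hypothetical and do not come from the paper's library. The idea is only that an application compares its measured throughput with what design-time analysis predicted, then picks the most accurate configuration that still meets its target, without consulting a supervisor or other applications.

```python
# Hypothetical design-time knowledge for one application:
# (knob setting, frames/s measured in isolation, error in percent).
DESIGN_TIME_POINTS = [
    ("high_accuracy", 10.0, 5.0),
    ("balanced", 20.0, 8.0),
    ("fast", 40.0, 12.0),
]


def observed_slowdown(expected_fps, measured_fps):
    """Ratio above 1 means the application runs slower than at design time."""
    return expected_fps / measured_fps


def select_point(points, target_fps, slowdown):
    """Pick the most accurate point whose throughput, scaled down by the
    observed slowdown, still meets the target. If none does, fall back to
    the fastest point."""
    feasible = [p for p in points if p[1] / slowdown >= target_fps]
    if feasible:
        return min(feasible, key=lambda p: p[2])
    return max(points, key=lambda p: p[1])


# The application expected 20 fps with "balanced" but measures 12.5 fps
# because other applications are contending for the same resources.
slowdown = observed_slowdown(20.0, 12.5)
choice = select_point(DESIGN_TIME_POINTS, target_fps=15.0, slowdown=slowdown)
print(choice[0])
```

In this sketch the decision uses only locally measured data, which is one way to read the low-overhead claim: no message is exchanged with any other process when the configuration changes.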
Open Computing Language (OpenCL) is emerging as a standard for parallel programming of heterogeneous hardware accelerators. Compared with device-specific languages, OpenCL enables application portability but does not guarantee performance portability, possibly requiring additional tuning of the implementation to a specific platform or to unpredictable dynamic workloads. In this paper, we present a methodology to analyze the customization space of an OpenCL application in order to improve performance portability and to support dynamic adaptation. We formulate our case study by implementing an OpenCL image stereo-matching application (which computes the relative depth of objects from a pair of stereo images) customized to the STMicroelectronics Platform 2012 many-core computing fabric. In particular, we use design space exploration techniques to generate a set of operating points that represent specific configurations of the parameters, allowing different trade-offs between the performance and the accuracy of the algorithm itself. These points give detailed knowledge about the interaction between the application parameters, the underlying architecture and the performance of the system; they could also be used by a run-time manager software layer to meet dynamic Quality-of-Service (QoS) constraints. To analyze the customization space, we use cycle-accurate simulations for the target architecture. Since profiling each configuration takes a long simulation time, we designed our methodology to reduce the overall number of simulations by exploiting some important features of the application parameters; our analysis also enables the identification of the parameters that could be explored on a high-level simulation model to reduce the simulation time. The resulting methodology is one order of magnitude more efficient than an exhaustive exploration and, given its randomized nature, it increases the probability of avoiding sub-optimal trade-offs.
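As a rough illustration of what a set of operating points with different performance/accuracy trade-offs might look like, the sketch below keeps only the candidate configurations that are Pareto-optimal in execution time and error. The `OperatingPoint` type, the parameter tuples, and all numeric values are hypothetical and are not taken from the paper. The paper's exploration is randomized and driven by cycle-accurate simulation, which this sketch does not attempt to reproduce; it only shows the filtering step under those assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    # Hypothetical parameters and metrics, for illustration only.
    config: tuple        # e.g. (window_size, disparity_range)
    exec_time_ms: float  # lower is better
    error_pct: float     # lower is better


def dominates(a, b):
    """True if a is no worse than b in both metrics and strictly better in one."""
    no_worse = a.exec_time_ms <= b.exec_time_ms and a.error_pct <= b.error_pct
    better = a.exec_time_ms < b.exec_time_ms or a.error_pct < b.error_pct
    return no_worse and better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


candidates = [
    OperatingPoint((3, 16), 10.0, 12.0),
    OperatingPoint((5, 16), 18.0, 8.0),
    OperatingPoint((5, 32), 30.0, 5.0),
    OperatingPoint((7, 32), 35.0, 9.0),   # slower and less accurate than (5, 32)
    OperatingPoint((3, 32), 20.0, 12.0),  # slower than (3, 16) at equal error
]
front = pareto_front(candidates)
for point in front:
    print(point.config, point.exec_time_ms, point.error_pct)
```

The pairwise check is quadratic in the number of candidates, which is acceptable for a small set of profiled configurations; the expensive part in practice would be obtaining the metrics, not filtering them.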
To support adaptivity of data-parallel applications on multi-core platforms, we propose a framework based on the combination of OpenCL application auto-tuning and run-time resource management. The framework addresses computationally intensive multimedia OpenCL applications. For these target applications, we show that application auto-tuning, based on design-time analysis, can become synergistic with run-time resource management. In the proposed framework, run-time decisions are taken autonomously by each application to achieve system adaptivity. This paper describes the methodology and related toolchain, defined during the 2PARMA European project, based on the integration of independent tools to provide effective compilation of OpenCL code, multi-objective design space exploration, application monitoring and tuning, and system-wide run-time resource management. Experimental results are reported for design optimization of an OpenCL stereo-matching application and then for a resource contention scenario where multiple stereo-matching applications are executed on the same platform with different run-time requirements.
To better exploit the capabilities offered by multi-core high-end embedded systems, new parallel programming paradigms, such as OpenCL, should be adopted in combination with effective resource management. However, dealing with mixed workloads and time-varying scenarios is still an open problem. This paper addresses these challenges by exploiting the synergy between Design Space Exploration and Run-Time Resource Management to achieve effective and flexible system-wide application adaptivity. The proposed approach and related toolset have been validated on a multi-core NUMA platform, showing significant improvements in terms of QoS and resource utilization compared to conventional application-level optimization strategies.