Abstract-Energy minimization of parallel applications is an emerging challenge for current and future generations of many-core computing systems. In this paper, we propose a novel and scalable energy minimization approach that suitably applies DVFS in the sequential part and jointly considers DVFS and dynamic core allocation in the parallel part. Fundamental to this approach is an iterative, learning-based control algorithm that adapts the voltage/frequency scaling and core allocation dynamically based on workload predictions, guided by CPU performance counters at regular intervals. The adaptation is facilitated through performance annotations in the application code, defined in a modified OpenMP runtime library. The proposed approach is validated on an Intel Xeon E5-2630 platform with up to 24 CPUs running NAS parallel benchmark applications. We show that our approach can effectively adapt to different architectures and core allocations, reducing energy consumption by up to 17% compared to existing approaches for a given performance requirement.

Keywords-Many-core, OpenMP, Energy minimization.

I. INTRODUCTION

Silicon technology scaling has enabled the fabrication of many interconnected cores on a single chip for current and future generations of computing systems. The emergence of such systems has enabled computing performance at unprecedented levels through application parallelization and architectural support. However, higher device-level integration and operating frequencies in these systems have led to exponentially increased power density and energy consumption [1]. Hence, minimizing energy consumption while delivering the required performance is a key design challenge for many-core applications [2], [5].

The continuing performance growth of many-core applications has been facilitated by parallel programming models. OpenMP is one such programming model, considered the de facto standard for shared-memory multiprocessing [3].
OpenMP features compiler-enabled annotations that can achieve data- or task-level parallelization. The parallelization is facilitated by runtime libraries that allocate the computing nodes and suitably schedule the parallel tasks and threads at runtime to achieve high performance [9].

Over the years, there has been growing interest in OpenMP-based dynamic adaptation of the energy/performance trade-offs of parallel applications [4]. Dynamic voltage/frequency scaling (DVFS) is a major runtime control knob for achieving such adaptation. The main principle of DVFS is to suitably lower the operating voltage/frequency, which sharply reduces energy consumption at the cost of a linear performance degradation [5]. Shirako et al. [10] proposed a DVFS-based energy minimization approach using a modified OpenMP compiler named OSCAR. The compiler analyses the criticality of the various parallel tasks and sections and identifies suitable DVFS settings for them. Cochran et al. [11] proposed another adaptive DVFS approach to control the peak power budget of an applicati...