In 2013, U.S. data centers accounted for 2.2% of the country's total electricity consumption, a gure that is projected to increase rapidly over the next decade. Many important data center workloads in cloud computing are interactive, and they demand strict levels of quality-of-service (QoS) to meet user expectations, making it challenging to optimize power consumption along with increasing performance demands. This paper introduces Hipster, a technique that combines heuristics and reinforcement learning to improve resource e ciency in cloud systems. Hipster explores heterogeneous multi-cores and dynamic voltage and frequency scaling (DVFS) for reducing energy consumption while managing the QoS of the latency-critical workloads. To improve data center utilization and make best usage of the available resources, Hipster can dynamically assign remaining cores to batch workloads without violating the QoS constraints for the latency-critical workloads. We perform experiments using a 64-bit ARM big.LITTLE platform, and show that, compared to prior work, Hipster improves the QoS guarantee for Web-Search from 80% to 96%, and for Memcached from 92% to 99%, while reducing the energy consumption by up to 18%. Hipster is also e ective in learning and adapting automatically to speci c requirements of new incoming workloads just enough to meet the QoS and optimize resource consumption. In this work, we extend our previous work in several ways. First, we present an analysis of the size of the reward lookup table and an optimization for the table to improve the scalability of our reinforcement learning mechanism. Second, we demonstrate Hipster's capability to adapt to changes in the latency-critical application at runtime and still satisfy QoS guarantees of the new incoming applications. Lastly, we present a deployment methodology for setting up new applications managed by Hipster's runtime system.