With the rapid growth in memory demands, the slowdown of DRAM scaling, and fluctuations in DRAM prices, DRAM has become one of the critical resources in cloud computing systems and datacenters. Compressed memory swap (CMS) is a promising technique that improves the effective memory capacity of the underlying computer system by compressing a subset of pages and storing them in memory instead of swapping them out to disk. While prior works have extensively investigated resource management techniques for workload consolidation, they lack the capability of dynamically allocating cores, memory, and CMS to the consolidated applications in a controlled and efficient manner. To bridge this gap, this work presents an in-depth characterization of the impact of cores, memory, and CMS on the QoS and throughput of consolidated latency-critical (LC) and batch applications. Guided by the characterization results, we propose COSMOS, a software-based runtime system for the coordinated management of cores, memory, and CMS, which enables QoS-aware and efficient workload consolidation for memory-intensive applications. COSMOS dynamically collects runtime data from the consolidated applications and the underlying system and allocates resources to the consolidated applications in a way that achieves high throughput with strong QoS guarantees. Our quantitative evaluation, based on a real system and widely used memory-intensive benchmarks, demonstrates the effectiveness of COSMOS: it robustly satisfies the QoS, achieves high throughput across all the evaluated workload mixes and scenarios, and significantly reduces the number of explored system states.

INDEX TERMS Cloud and datacenter computing, compressed memory swap, efficiency, quality-of-service, resource management, workload consolidation.