Resource sharing systems describe situations in which users compete for service from scarce resources. Examples include check-in lines at airports, waiting rooms in hospitals, queues in contact centers, data buffers in wireless networks, and delayed service in cloud data centers. These are all situations with jobs (clients, patients, tasks) and servers (agents, beds, processors), with large capacity levels ranging from the order of tens (checkouts) to thousands (processors). This survey investigates how to design such systems to exploit resource pooling and economies of scale. In particular, we review the mathematics behind the Quality-and-Efficiency Driven (QED) regime, which lets the system operate close to full utilization as the number of servers grows large, while delays remain manageable. We also discuss emerging research directions related to load balancing, overdispersion, and model uncertainty.

arXiv:1706.05397v1 [math.PR]