The performance and behavior of large-scale distributed applications are highly influenced by network properties such as latency, bandwidth, packet loss, and jitter. For instance, an engineer might need to answer questions such as: What is the impact of an increase in network latency on application response time? How does moving a cluster between geographical regions affect application throughput? How do network dynamics affect application stability? Answering these questions in a systematic and reproducible way is very hard, given the variability of, and lack of control over, the underlying network. Unfortunately, state-of-the-art network emulators and testbeds scale poorly (e.g., Mininet), focus exclusively on the control plane (e.g., CrystalNet), or ignore network dynamics (e.g., Emulab).

Kollaps is a fully distributed network emulator that addresses these limitations. Kollaps hinges on two key observations. First, from an application's perspective, what matters are the emergent end-to-end properties (e.g., latency, bandwidth, packet loss, and jitter) rather than the internal state of the routers and switches that give rise to those properties. This premise allows us to build a simpler, dynamically adaptable emulation model that avoids maintaining the full network state. Second, this simplified model can be maintained in a fully decentralized way, allowing the emulation to scale with the number of machines running the application.

Kollaps is fully decentralized, agnostic to the application's language and transport protocol, scales to thousands of processes, and is accurate when compared against a bare-metal deployment or state-of-the-art approaches that emulate the full state of the network. We showcase how Kollaps can accurately reproduce results from the literature and predict the behaviour of a complex, unmodified distributed key-value store (i.e., Cassandra) under different deployments.

These challenges also arise in the reproducibility crisis currently plaguing the systems community [29, 64]. Different results for the same system emerge not only because systems are evaluated under different, uncontrollable conditions, but also because research testbeds such as Emulab [46], CloudLab [72], or PlanetLab [35], commonly used to conduct experiments, tend to get overloaded right before systems conference deadlines [51]. We therefore need tools to systematically assess and reproduce the evaluation of large-scale applications. Notably, similar tools could be used to test for, and thus prevent, costly incidents due to misconfigurations, as in recent events [4, 20].

One approach to systematically evaluate a large-scale distributed system is to resort to simulation, which relies on models that capture the key properties of the target system and environment [25]. Simulation provides full control over the system and environment, thereby achieving full reproducibility, and allows one to study the model of the system in a variety of scenarios. However, simulation suffers from several well-known problems. In fact, there is a large gap between the...
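To make the first observation above concrete, the following minimal sketch shows how end-to-end path properties can be imposed directly at an end host using the Linux tc-netem queueing discipline, in the spirit of (though far simpler than) Kollaps's emulation model. The interface name, function name, and parameter values are illustrative assumptions, not taken from the paper; applying tc rules requires root privileges.

    import subprocess

    def shape_link(dev: str, delay_ms: float, jitter_ms: float,
                   loss_pct: float, rate_mbit: float) -> None:
        """Impose end-to-end properties on a network interface via tc-netem.

        The application only observes the emergent latency, jitter, loss,
        and bandwidth of a path, so those can be imposed directly at the
        end hosts instead of emulating every router and switch in between.
        """
        # Replace the root qdisc on `dev` with a netem qdisc carrying the
        # desired end-to-end properties.
        subprocess.run(
            ["tc", "qdisc", "replace", "dev", dev, "root", "netem",
             "delay", f"{delay_ms}ms", f"{jitter_ms}ms",
             "loss", f"{loss_pct}%",
             "rate", f"{rate_mbit}mbit"],
            check=True,
        )

    # Hypothetical example: emulate a cross-region path with 80 ms +/- 5 ms
    # latency, 0.1% packet loss, and a 100 Mbit/s bottleneck on eth0.
    shape_link("eth0", delay_ms=80, jitter_ms=5, loss_pct=0.1, rate_mbit=100)

Because the rule can be replaced at runtime, re-invoking shape_link with new values is one simple way to approximate the dynamically changing network conditions that the paper argues an emulator must capture.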