The empirical study of large-scale distributed systems often calls for the use of computer simulations, as real-world experimentation is too costly or simply infeasible. Computer simulations can also provide results in a much shorter timespan, increasing productivity. Nevertheless, large-scale system simulation can prove to be non-responsive on modern computers, especially when the modeled system has a high level of complexity or when detailed and compute-intensive models are used. In order to fully harness the computational power of modern multi-core computer architectures, computer simulations need to execute in a parallel fashion. In this paper, we investigate the potential of parallelizing the execution of the Grid Economics Simulator (GES), a Java-based discrete-event simulator that is targeted towards the simulation of distributed systems in general, and economic forms of resource management in grids in particular. We present the design of a parallel continuation-based simulation core that uses a conservative time synchronization protocol. We analyze the performance of the parallel simulation core through synthetic benchmarks. The results of our performance evaluation give a clear insight into the impact of simulation model properties, such as event arrival rates, computational workload, remoteness of events, and look-ahead size, on the speedup that can be achieved through parallel execution.
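To illustrate why look-ahead size matters for a conservative time synchronization protocol, the following is a minimal sketch and not the GES design. It assumes each logical process knows the last simulation time reported by each of its neighbors, and that a neighbor never sends an event stamped earlier than its reported time plus the look-ahead. The names `safe_horizon` and `process_safe_events` are hypothetical.

```python
import heapq

def safe_horizon(neighbor_times, lookahead):
    """Earliest timestamp any neighbor could still send (assumed guarantee)."""
    return min(t + lookahead for t in neighbor_times.values())

def process_safe_events(pending, neighbor_times, lookahead):
    """Pop every pending event stamped strictly before the safe horizon.

    `pending` is a heap of (timestamp, payload) tuples. Events at or beyond
    the horizon stay queued, because an earlier event could still arrive.
    """
    horizon = safe_horizon(neighbor_times, lookahead)
    processed = []
    while pending and pending[0][0] < horizon:
        processed.append(heapq.heappop(pending))
    return processed
```

Under these assumptions, a larger look-ahead pushes the horizon further out, so more events can be processed before the process has to wait for its neighbors.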
A discrete-event simulator's ability to distribute the execution of a simulation model allows one to deal with the memory limitations of a single computational resource, and thereby increase the scale or level of detail at which models can be studied. In addition, distribution has the potential to reduce the round-trip time of a simulation by incorporating multiple computational cores into the simulation's execution. However, such gains can be voided by the overhead that time synchronization protocols introduce. These protocols are required to prevent causality errors during the parallel execution of a simulation. The overhead depends on the protocol, the characteristics of the simulation model, and the architecture of the computational resources used. Recently, infrastructure-as-a-service offerings in cloud computing have introduced flexibility in acquiring computational resources on a pay-as-you-go basis. At present, it is unclear to what extent these offerings are suited for the distributed execution of discrete-event simulations, and how the characteristics of different resource types impact the runtime performance of distributed simulations. In this paper, we investigate this issue and assess the performance of different conservative time synchronization protocols on a range of cloud resource types that are currently available on Amazon EC2.
Computer simulations have become an indispensable tool for the empirical study of large-scale systems. The timely simulation of these systems, however, is not without its challenges. Simulators have to be able to harness the full computational power of modern architectures through parallel execution and overcome the memory limitations of a single computer. In this paper, we investigate techniques for distributed and parallel execution of the Grid Economics Simulator. We present the design of a parallel and distributed simulation core that uses a conservative time synchronization protocol, and describe the optimizations we performed to improve the performance of the simulator. We analyze the performance of the distributed simulation setup through two different application scenarios. Our results demonstrate how the presented techniques contribute to attaining significant speedups on a distributed system consisting of multi-core machines and commodity networking hardware.
Research into large-scale distributed systems often relies on the use of simulation frameworks in order to bypass the disadvantages of performing experiments on real testbeds. SimGrid is one such framework, widely used and mature. However, we have identified a scalability problem in SimGrid's network simulation layer that limits the number of hosts one can incorporate in a simulation. For modeling large-scale systems such as grids, this is unfortunate, as the simulation of systems with tens of thousands of hosts is required. This paper describes how we have overcome this limitation through more efficient storage methods for network topology and routing information. It also describes our use of dynamic routing calculations as an alternative to the current SimGrid method, which relies on a static routing table. This reduces the memory footprint of the network simulation layer significantly, at the cost of a modest increase in the runtime of the simulation. We evaluate the effect of our approach quantitatively in a number of experiments.
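The trade-off between a static routing table and dynamic routing calculations can be sketched as follows. This is an illustration under stated assumptions, not SimGrid's API or the paper's algorithm: the topology is assumed to be an adjacency list, routes are assumed to minimize hop count, and the names `route_on_demand` and `static_table_entries` are hypothetical.

```python
from collections import deque

def route_on_demand(adjacency, src, dst):
    """Compute a shortest-hop route with BFS only when it is requested."""
    parents = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the parent links back to the source to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in adjacency.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None  # dst is unreachable from src

def static_table_entries(host_count):
    """Number of (src, dst) routes a full static table would have to store."""
    return host_count * (host_count - 1)
```

With 10,000 hosts, a full static table would hold 99,990,000 routes, whereas the on-demand variant stores only the adjacency list and pays a search on each lookup. This mirrors the memory-for-runtime exchange described above.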