Proceedings of the 2013 ACM Cloud and Autonomic Computing Conference 2013
DOI: 10.1145/2494621.2494625

Autonomous, failure-resilient orchestration of distributed discrete event simulations

Abstract: Discrete event simulations model the behavior of complex, real-world systems. Simulating a wide range of relevant events and conditions naturally provides a more accurate model, but also increases the computational workload associated with the simulation. To manage these processing requirements in a scalable manner, a discrete event simulation can be distributed across a number of computing resources. However, individual tasks in the simulation are stateful, and therefore require inter-task communication and s…

Cited by 4 publications (3 citation statements)
References 24 publications
“…Each modification of the input parameters requires a new set of iterations to be run. Dividing the target simulation into several units and executing them in parallel is one way to improve overall execution times [5,6], but generally does not enable real-time exploration. In this work, we target real-time computational guarantees that involve providing subsecond, interactive responses to the user as simulation parameters are changed.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Authors' addresses: Z. Sui, M. Malensek, and S. Pallickara, Computer Science Department, Colorado State University, 1873 Campus Delivery, Fort Collins, CO 80523-1873, USA; Harvey, Department of Computing and Information Science, University of Guelph, Guelph, Ontario N1G 2W1, Canada.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Ultimately, each of these items can have severe and unexpected performance consequences in a distributed setting and must be accounted for to ensure that resources are used efficiently. Our prior research [Malensek et al 2013], which focused on autonomous fault tolerance functionality, has been extended in this article to target the two remaining aspects of resource uncertainty. Resource slowdowns may occur due to an increase in the number of processes executing concurrently at the resource, load spikes, or runaway processes resulting from a coding error.…”
Section: Introduction
Citation type: mentioning
confidence: 99%