In parallel and distributed systems, validation of scheduling heuristics is usually done by simulation on randomly generated synthetic workloads, typically represented by task graphs. Since there is no single generation method that models all possible workloads for scheduling problems, researchers often re-implement the classical generation algorithms or even implement ad hoc ones. A bad choice of generation method can mislead the validation of an algorithm because of the biases it can induce. Moreover, different implementations of the same randomized generation method may produce slightly different graphs. These problems can harm the experimental comparison of scheduling algorithms. In order to provide a comparison basis, we propose GGen, a unified and standard implementation of the classical task graph generation methods used in the scheduling domain. We also provide an in-depth analysis of each generation method, emphasizing important graph properties that may influence scheduling algorithms.
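For illustration only, the sketch below shows one classical random task graph generation method of the kind such a tool standardizes: an Erdős–Rényi-style construction in which a precedence edge i -> j (with i < j) is added with independent probability p. This is not GGen's own code; the function name and parameters are hypothetical.

import random

def random_task_dag(n, p, seed=None):
    # Classical Erdos-Renyi-style DAG generation: tasks are numbered
    # 0..n-1 and an edge i -> j is only allowed when i < j, which
    # guarantees the resulting graph is acyclic.
    rng = random.Random(seed)
    edges = [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]
    return list(range(n)), edges

# Example: a 10-task graph with edge probability 0.3.
tasks, edges = random_task_dag(10, 0.3, seed=42)
print(len(tasks), "tasks,", len(edges), "precedence edges")

Different choices of n and p change structural properties such as graph density and average path length, which is exactly the kind of bias the analysis above refers to.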
In the age of cloud, Grid, P2P, and volunteer distributed computing, large-scale systems with tens of thousands of unreliable hosts are increasingly common. Invariably, these systems are composed of heterogeneous hosts whose individual availability often exhibits different statistical properties (for example, stationary versus nonstationary behavior) and fits different models (for example, exponential, Weibull, or Pareto probability distributions). In this paper, we describe an effective method for discovering subsets of hosts whose availability has similar statistical properties and can be modeled with similar probability distributions. We apply this method to about 230,000 host availability traces obtained from a real Internet-distributed system, namely SETI@home. We find that about 21 percent of hosts exhibit availability that is a truly random process, and that these hosts can often be modeled accurately with a few distinct distributions from different families. We show that our models are useful and accurate in the context of a scheduling problem that deals with resource brokering. We believe that these methods and models are critical for the design of stochastic scheduling algorithms across large systems where host availability is uncertain.
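As a rough illustration of the kind of modeling described above (not the authors' exact method), the following sketch fits exponential, Weibull, and Pareto distributions to a set of availability intervals with SciPy and compares them with a Kolmogorov–Smirnov test. The interval data here is simulated; real input would be measured traces, and the variable names are hypothetical.

import numpy as np
from scipy import stats

# Synthetic availability intervals (in hours) standing in for one
# subset of hosts; real data would come from measured traces.
rng = np.random.default_rng(0)
intervals = rng.weibull(0.8, size=500) * 10.0

# Fit each candidate family, then compare goodness of fit with a
# Kolmogorov-Smirnov test against the fitted parameters.
candidates = {"exponential": stats.expon,
              "weibull": stats.weibull_min,
              "pareto": stats.pareto}
for name, dist in candidates.items():
    params = dist.fit(intervals)
    ks_stat, p_value = stats.kstest(intervals, dist.name, args=params)
    print(f"{name:12s} KS statistic = {ks_stat:.3f}, p-value = {p_value:.3f}")

Grouping hosts before fitting (for example, by clustering summary statistics of their intervals) is what allows a few distinct distributions to cover a large fraction of the host population.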
Large-scale applications running on new computing platforms with thousands of processors have to face reliability problems: the failure of a single processor will cause the entire execution to fail. Most existing approaches to guarantee reliable executions are based on fault tolerance mechanisms, and coordinated checkpointing is one of the most popular techniques to deal with failures on such platforms. This work presents a new model of the coordinated Checkpoint/Restart mechanism for several types of computing platforms. The model is parametrized by the process failure distribution, the cost of saving a globally consistent state of the processes, and the number of computational resources. Through a mathematical analysis of reliability, we apply this new model to compute the optimal interval between checkpoints in order to minimize the average completion time. The model's independence from the type of failure law makes it completely flexible. We show that such a model may be used to reduce the checkpoint rate by up to 20 percent in some cases, and the total overhead by up to a factor of 4 in others. Finally, we report some experiments based on simulations of random failure distributions corresponding to the two most popular laws, namely the Poisson process and the Weibull law.
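For context, the classical first-order approximation for exponentially distributed failures (Young's formula) gives an optimal checkpoint interval of roughly sqrt(2 * C * MTBF), where C is the checkpoint cost. The short sketch below computes that value; it illustrates the general idea of an optimal checkpoint interval rather than this paper's law-independent model, and the numbers are made up.

import math

def optimal_checkpoint_interval(checkpoint_cost, mtbf):
    # Classical first-order approximation (Young's formula) for the
    # interval between checkpoints that minimizes expected lost work,
    # assuming exponentially distributed failures:
    #   T_opt ~= sqrt(2 * C * MTBF)
    return math.sqrt(2.0 * checkpoint_cost * mtbf)

# Example with made-up numbers: saving a global state costs 10 minutes
# and the platform-wide MTBF is 8 hours (480 minutes).
print(optimal_checkpoint_interval(10.0, 480.0))  # about 98 minutes

A model parametrized by an arbitrary failure distribution, as proposed above, refines this kind of estimate when failures are not exponential (for example, Weibull-distributed).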
Presence and Instant Messaging have recently emerged as a new medium of communication over the Internet. Presence is a means for finding, retrieving, and subscribing to changes in the presence information (e.g., "online" or "offline") of other users. Instant messaging is a means for sending small, simple messages that are delivered immediately to online users. Applications of presence and instant messaging currently use independent, non-standard, and non-interoperable protocols developed by various vendors. The goal of the Instant Messaging and Presence Protocol (IMPP) Working Group is to define a standard protocol so that independently developed applications of instant messaging and/or presence can interoperate across the Internet. This document defines a minimal set of requirements that IMPP must meet.