Part 5: Distributed Computing

International audience

In modern cloud software systems, the complexity arising from feature interaction, geographical distribution, security and configurability requirements increases the likelihood of faults. Additional influencing factors are the impact of different execution environments as well as human operation or configuration errors. Assuming that any non-trivial cloud software system contains faults, robustness testing is needed to ensure that such faults are discovered as early as possible, and that the overall service is resilient and fault tolerant. To this end, fault injection is a means for disrupting the software in ways that uncover bugs and test the fault tolerance mechanisms. In this paper, we discuss how to experimentally assess software dependability in two steps. First, a model of the software is constructed from different runtime observations and configuration information. Second, this model is used to orchestrate fault injection experiments with the running software system in order to quantify dependability attributes such as service availability. We propose the architecture of a fault injection service within the OpenStack project.
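
The two-step assessment described above can be illustrated with a minimal sketch. It is not the architecture proposed in the paper: the `ServiceModel` class, the `run_campaign` function, and the component names are all hypothetical, the dependency graph is assumed to be acyclic, and faults are simulated in memory rather than injected into a running OpenStack deployment. The sketch only shows the idea of building a model first and then using it to drive fault scenarios and derive an availability figure.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceModel:
    """Hypothetical dependency model: component -> components it requires.

    Assumes the dependency graph is acyclic; a cycle would recurse forever.
    """
    depends_on: dict[str, list[str]] = field(default_factory=dict)

    def is_up(self, component: str, failed: set[str]) -> bool:
        # A component is up only if it and all of its transitive
        # dependencies are outside the failed set.
        if component in failed:
            return False
        return all(
            self.is_up(dep, failed)
            for dep in self.depends_on.get(component, [])
        )


def run_campaign(model: ServiceModel, entry_point: str,
                 faults: list[set[str]]) -> float:
    """Apply each fault scenario in turn and return the fraction of
    scenarios in which the entry point stayed up (a crude availability
    estimate over the campaign)."""
    if not faults:
        raise ValueError("at least one fault scenario is required")
    survived = sum(1 for failed in faults if model.is_up(entry_point, failed))
    return survived / len(faults)


# Step 1: a toy model (component names are illustrative only).
model = ServiceModel({
    "api": ["scheduler", "database"],
    "scheduler": ["message_queue"],
    "database": [],
    "message_queue": [],
    "metrics": [],
})

# Step 2: one single-component fault per experiment.
scenarios = [{name} for name in model.depends_on]
availability = run_campaign(model, "api", scenarios)
print(f"estimated availability under single faults: {availability:.2f}")
```

In this toy topology only the failure of `metrics` leaves `api` reachable, so the estimate is low. A real campaign would replace the in-memory `is_up` check with probes against the running service and would model redundancy and recovery mechanisms.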