A well-designed high-performance computing (HPC) course not only presents theoretical parallelism concepts but also includes practical work on parallel systems. Today's machine models are diverse, and as a consequence multiple programming models exist. The challenge for HPC course lecturers is to decide what to include and what to exclude. We have experience in teaching HPC in a multi-paradigm style. The practical course parts include message-passing programming using MPI, directive-based shared-memory programming using OpenMP, partitioned global address space (PGAS) programming using Chapel, and domain-specific programming using a high-level framework. If these models were taught in isolation, students would have difficulty assessing the strengths and weaknesses of the approaches presented. We propose a project-based approach which introduces a specific problem to be solved (in our case a stencil computation) and asks students to solve it with each of the programming models introduced. Our course has been taught successfully several times, but a major problem has always been checking the individual student solutions, especially deciding which of the reported performance results can be trusted. To overcome these deficiencies, we have built a pedagogical tool that enhances trust in students' work. In this paper we present the infrastructure and tools that make student experiments easily reproducible by lecturers. We introduce a taxonomy for general benchmark experiments, describe the distributed architecture of our development and analysis environment, and, as a case study, discuss performance experiments when solving a stencil problem in multiple programming models.
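
To make the running example concrete, the following is a minimal sketch of the kind of stencil kernel the course project revolves around, here in C with OpenMP (one of the models taught). The grid size, sweep count, and boundary handling are illustrative assumptions, not the course's actual assignment code.

    /* Minimal 5-point Jacobi stencil in C with OpenMP.
     * Compile with, e.g., cc -O2 -fopenmp stencil.c
     * N, STEPS, and the boundary condition are assumed values. */
    #include <stdio.h>
    #include <stdlib.h>

    #define N      1024   /* interior grid points per dimension (assumed) */
    #define STEPS  100    /* number of sweeps (assumed) */
    #define IDX(i, j) ((i) * (N + 2) + (j))

    int main(void) {
        double *a = calloc((N + 2) * (N + 2), sizeof *a);
        double *b = calloc((N + 2) * (N + 2), sizeof *b);
        if (!a || !b) return EXIT_FAILURE;

        /* Fixed boundary: a constant heat source on the left edge. */
        for (int i = 0; i < N + 2; i++)
            a[IDX(i, 0)] = b[IDX(i, 0)] = 100.0;

        for (int t = 0; t < STEPS; t++) {
            /* Each sweep averages the four neighbours of every interior
             * point; rows are distributed across threads. */
            #pragma omp parallel for
            for (int i = 1; i <= N; i++)
                for (int j = 1; j <= N; j++)
                    b[IDX(i, j)] = 0.25 * (a[IDX(i - 1, j)] + a[IDX(i + 1, j)]
                                         + a[IDX(i, j - 1)] + a[IDX(i, j + 1)]);
            double *tmp = a; a = b; b = tmp;   /* swap read/write grids */
        }

        printf("center value after %d sweeps: %f\n", STEPS, a[IDX(N / 2, N / 2)]);
        free(a);
        free(b);
        return 0;
    }

The same kernel re-expressed in MPI, Chapel, and a domain-specific framework gives students directly comparable performance experiments, which is what makes the reported results worth checking for reproducibility.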