2015
DOI: 10.1145/2723872.2723882

An introduction to Docker for reproducible research

Abstract: As computational work becomes more and more integral to many aspects of scientific research, computational reproducibility has become an issue of increasing importance to computer systems researchers and domain scientists alike. Though computational reproducibility seems more straightforward than replicating physical experiments, the complex and rapidly changing nature of computer environments makes being able to reproduce and extend such work a serious challenge. In this paper, I explore common reasons that …

Citations: cited by 863 publications (583 citation statements)
References: 21 publications (48 reference statements)
“…Clear arrangements for the storage and preservation of the code should be made, instructions need to be provided that will allow the code to be compiled and run without issue, and the code should be accompanied by a description of the core functionalities and hard- and software requirements for its use. This in turn means that source code alone is not sufficient: the software environment needs to be described too, including for instance, any linked libraries, any runtime environments or virtual machines, The open source container engine Docker is intended to provide an efficient solution for computational reproducibility (see www.docker.com) [25,26]. 7 Researchers sometimes prefer not to share code because of a lack of complete and clear documentation.…”
Section: Open Source: Sustainable Software For Sustainable Science (mentioning)
confidence: 99%
“…However, despite this shortage of empirical work, the importance of Docker for academia and industry is rarely doubted in literature. For instance, Boettiger and Cito et al have concurrently proposed that containerization technology may be an important game changer in making systems and software engineering research more reproducible [4], [25]. In industry, Docker is increasingly being used to build next generation Platform-as-a-Service clouds [26].…”
Section: Related Work (mentioning)
confidence: 99%
“…Given the fast rise in popularity, its ubiquitous nature in industry, and its surrounding claim of enabling reproducibility [4], we study the Docker ecosystem with respect to quality of Dockerfiles and their change and evolution behavior within software repositories. We developed a tool chain that transforms Dockerfiles and their evolution in Git repositories into a relational database model.…”
Section: Introduction (mentioning)
confidence: 99%
“…A container can be paused, stopped, and restarted, or be removed from the host. While not being intentioned for it, Docker is a means to ensure long term reproducibility of computational research, as demonstrated for example for R (Boettiger, 2015). A Docker image suffices to capture the data, software, and runtime environment in a well-defined manner and facilitates reproducibility.…”
Section: Workpace Preparation (mentioning)
confidence: 99%