2017
DOI: 10.1186/s12859-017-1747-0

Investigating reproducibility and tracking provenance – A genomic workflow case study

Abstract: Background: Computational bioinformatics workflows are extensively used to analyse genomics data, with different approaches available to support implementation and execution of these workflows. Reproducibility is one of the core principles for any scientific workflow and remains a challenge, which is not fully addressed. This is due to incomplete understanding of reproducibility requirements and assumptions of workflow definition approaches. Provenance information should be tracked and used to capture all these …

citations
Cited by 64 publications
(64 citation statements)
references
References 48 publications
“…Repeatability is defined as obtaining the same results after re-running the same process on the same set of samples, while reproducibility refers to the ability to obtain similar results on a different set of samples [7]. Assessing repeatability and reproducibility is among the cornerstones of good scientific conduct and is being adopted in many areas of high-throughput experiments such as clinical genomics [8]. Studies have assessed the reproducibility of the microbiome profile as part of MQCP [2].…”
Section: Introduction
confidence: 99%
“…Jupyter notebooks are interactive documents that integrate text, code, and analysis results (Kluyver et al, 2016). A major issue for genomic analyses today is how to clearly explain the computational methods used in order to allow for reproducibility (Kanwal et al, 2017). This open access pipeline is intended to provide an example template to improve reproducibility in future studies and function as an instructional tool for biologists and early career scientists who wish to apply these methods to their own study organisms.…”
Section: Introduction
confidence: 99%
“…This workflow is not merely a reproducibility validation tool, it is an attempt to make research product more reusable by the community using an online platform, beyond the publication process. Such system could be seen as a generalisation of already existing workflow systems such as Galaxy or GATK, integrating data provenance [39,40]. Some top-down initiatives already provide some incentives for such a process i.e.…”
Section: Conclusion and Perspective
confidence: 99%