2017
DOI: 10.1515/itit-2016-0031

Opinion paper: Data provenance challenges in biomedical research

Abstract: In this opinion paper we provide an overview of some challenges concerning data provenance in biomedical research. We reflect current literature and depict some examples of existing implicit or explicit provenance aspects in some standard data types in translational research. Furthermore, we assess the need of further data provenance standardization in biomedical informatics. Basic data provenance should provide a recall about the origin of the data, transformation process steps, support replication and presen…

Cited by 5 publications (6 citation statements)
References 0 publications

“…This can be used to identify potentially invalid processing steps, data quality degradation, or limitations for secondary use [ 3 , 4 , 6 ]. Terms such as data lineage and data pedigree can have slightly different meanings in some of the literature (eg, pedigree is sometimes understood as also capturing information about the quality or trustworthiness of data sources [ 3 , 4 ]) but are often also used interchangeably with provenance (eg, the studies by Simmhan et al [ 6 ] and Baum et al [ 7 ]), which is the approach we follow in this paper.…”
Section: Introduction (mentioning)
confidence: 99%
“…Although data provenance tracking is a common practice in some disciplines, such as physics, geoscience, geography (particularly in geographic information systems), material science, hydrologic science, and environmental modeling [ 15 - 19 ], it has yet to be widely adopted in many other data-driven research disciplines, including biomedical research [ 7 ]. Consequently, previous reviews either focused on provenance outside the biomedical context (eg, the studies by Simmhan et al [ 6 ] and Herschel et al [ 3 ]) or studied a larger spectrum of data generation and preparation activities of which provenance is just one aspect (eg, the study by de Lusignan et al [ 4 ]).…”
Section: Introduction (mentioning)
confidence: 99%
“…Because data collection can occur over years, obtaining accurate data provenance is a key challenge. [7] This is especially true for large multisite collaborations. Samples within SEEK are assumed to be immutable (ie, deposited only once), which does not allow for the ongoing collection and updating of information in a simple, coherent, and organized way.…”
Section: Introduction (mentioning)
confidence: 99%
“…While systems for automated workflows and provenance capture have gained traction in specialised domains such as bioinformatics, the use, or indeed recognition of the need for provenance more generally, such as in the biomedical field as a whole remains "quite low" [4].…”
Section: Introduction (mentioning)
confidence: 99%
“…http://www.orchid.ac.uk/ 3 https://github.com/taverna/taverna-prov 4 "PROV support" https://github.com/VisTrails/VisTrails/issues/1075 5 https://www.w3.org/TR/2013/REC-prov-o-20130430/#Entity…”
mentioning
confidence: 99%