2022
DOI: 10.1108/jd-08-2021-0167

Measuring the time spent on data curation

Abstract: Purpose: Budgeting data curation tasks in research projects is difficult. In this paper, we investigate the time spent on data curation, more specifically on cleaning and documenting quantitative data for data sharing. We develop recommendations on cost factors in research data management. Design/methodology/approach: We make use of a pilot study conducted at the GESIS Data Archive for the Social Sciences in Germany between December 2016 and September 2017. During this period, data curators at GESIS - Leibniz Insti…

citations
Cited by 9 publications
(6 citation statements)
references
References 16 publications
“…First, the authors might not be aware of such a policy and, hence, do not know how research data should be handled. Second, although researchers know how to manage research data according to the institutional research-data policy, they balk at the effort of storing them in a repository because the curation of research data is related to significant costs (Perry & Netscher, 2022). This is a reasonable strategy in case the nonadherence to such guidelines is not sanctioned.…”
Section: The Role Of Institutional Research-data Management Policies
Citation type: mentioning
confidence: 99%
“…Note that making research data available is related to costs (Perry & Netscher, 2022). Data-curation costs and implementing transparency standards vary depending on—among other things—disciplines, study design, data complexity, and the personal information included in the data (Hensel, 2021).…”
Section: The Costs and Benefits Of Available Research Data
Citation type: mentioning
confidence: 99%
“…It is important to note that making research data available is related to costs (Perry & Netscher, 2022). Data curation costs and implementing transparency standards vary depending on - among others - disciplines, study design, data complexity, and the personal information included in the data (Hensel, 2021).…”
Section: The Costs and Benefits Of Available Research Data
Citation type: mentioning
confidence: 99%
“…First, the authors might not be aware of such a policy and, hence, do not know how research data should be handled. Second, although researchers know how to manage research data according to the institutional research data policy, they balk at the effort of storing them in a repository because the curation of research data is related to significant costs (Perry & Netscher, 2022). This is a reasonable strategy in case the non-adherence to such guidelines is not sanctioned.…”
Section: The Role Of Institutional Research Data Management Policies
Citation type: mentioning
confidence: 99%
“…This process is crucial in scientific research, data-driven decision-making, and knowledge dissemination [1]-[4]. However, data curation is often a time-intensive, manual process including data gathering, annotation and cross-referencing in a defined data model [5]. The recent successful application of Large Language Models (LLMs) such as GPT3.5 to fields like social sciences [6], education [7], art [8], software [9], health care [10], clinical research [11], and medicine [12] provides a compelling motive to develop good benchmarking exercises to test the capabilities of current LLMs to streamline biological data curation and to track improvements in these capabilities as new LLMs and training strategies develop.…”
Section: Introduction
Citation type: mentioning
confidence: 99%