2022
DOI: 10.1038/s41597-022-01143-6
A large-scale study on research code quality and execution

Abstract: This article presents a study on the quality and execution of research code from publicly-available replication datasets at the Harvard Dataverse repository. Research code is typically created by a group of scientists and published together with academic papers to facilitate research transparency and reproducibility. For this study, we define ten questions to address aspects impacting research reproducibility and reuse. First, we retrieve and analyze more than 2000 replication datasets with over 9000 unique R …

Cited by 75 publications (75 citation statements)
References 39 publications
“…While the former clearly do not meet the eligibility requirements, we cannot judge the reproducibility, and thus the badge eligibility, of the latter at submission. We make several recommendations for how to improve the specific badge policy at Psychological Science and comparable initiatives at other journals (for further general recommendations on how to improve data sharing and computational reproducibility, see, e.g., Stodden et al., 2016; Trisovic et al., 2022; Wilson et al., 2017). Excellent and more in-depth recommendations and tutorials for authors to ensure that their shared data and code are eligible for an Open Data badge are provided by, for example, Arslan (2019), Eberle (2022, 2021).…”
Section: Discussion
confidence: 99%
“…However, especially in structural equation modeling research, sharing syntax is essential to evaluate model specifications and decisions, as they are often unavailable upon request (Wicherts & Crompvoets, 2017). In addition, many minor adjustments, such as capturing research workflow or managing project library paths and dependencies, can significantly improve the quality of research code (Trisovic et al, 2022). On the other hand, it is up to journals to ensure that authors with an open data statement in their manuscript share their (intelligible) data.…”
Section: Discussion
confidence: 99%
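The excerpt above echoes one of the study's recurring findings: small workflow habits, such as avoiding hard-coded absolute paths and pinning the project's library locations, make shared research code far more likely to run on another machine. The study itself analyzed R code; the sketch below illustrates the same path-hygiene principle in Python, with hypothetical file names, rather than reproducing the authors' actual scripts.

```python
# Illustrative sketch (hypothetical file names): resolve data files
# relative to the script's own location instead of hard-coding an
# absolute path, a common cause of execution failure in shared
# research code noted by Trisovic et al. (2022).
from pathlib import Path

# Fragile: only works on the original author's machine.
# data_file = Path("/Users/alice/project/data/survey.csv")

# Portable: anchor all inputs at the project root, derived from
# the location of this script at run time.
PROJECT_ROOT = Path(__file__).resolve().parent
DATA_FILE = PROJECT_ROOT / "data" / "survey.csv"

def load_rows(path: Path) -> list[str]:
    """Read a text file into a list of lines, or return an empty
    list if the file is missing (so the script fails gracefully)."""
    if not path.exists():
        return []
    return path.read_text().splitlines()

rows = load_rows(DATA_FILE)
```

Pairing this with a pinned dependency file (e.g., a lock file listing exact package versions) addresses the second adjustment the excerpt mentions, managing project dependencies.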
“…First, the literature review indicates that the source code is often unavailable, making it impossible to reproduce the study accurately. Second, the algorithms are published in pseudo-code, whose informality risks errors being overlooked or introduced when they are translated into real programming languages [13,14]. Additionally, they were tested on a small number of datasets [15][16][17].…”
Section: Motivation and Significance
confidence: 99%