Data storage and sharing for the long tail of science

Zhang, Boyu; Pouchard, Line; Smith, Preston; Gasc, Amandine; Pijanowski, Bryan C.

doi:10.1109/nysds.2016.7747811

Cited by 4 publications

(2 citation statements)

References 32 publications

“…Requirements for reproducibility are numerous and unclear and only started to be explored in details every step of a computational experiment (Carpen-Amarie et al, 2014). Recent recommendations include publishing source code, computational environments, and workflows in trusted repositories with persistent identifiers and links (Zhang et al, 2016a), as well as designing incentives to encourage reproducibility by journals and funding agencies (Stodden et al, 2016).…”

Section: Related Work (mentioning)

confidence: 99%

Computational reproducibility of scientific workflows at extreme scales

Pouchard, Baldwin, Elsethagen, et al. 2019

The International Journal of High Performance Computing Applica

We propose an approach for improved reproducibility that includes capturing and relating provenance characteristics and performance metrics, in a hybrid queriable system, the ProvEn server. The system capabilities are illustrated on two use cases: scientific reproducibility of results in the ACME climate simulations and performance reproducibility in molecular dynamics workflows on HPC computing platforms.

“…Part of the challenge is the long tail of data, where different communities, small research teams and individual experiments have specific requirements on how the data is stored and catalogued [16]. Each community or individual scientist working on small projects produce a large portion of the total scientific output and do not have many resources to make the data accessible to the wider public [35]. With funding agencies requiring data management plans upfront [13,25], storing data on local hard drives is not feasible anymore.…”

Section: Introduction (mentioning)

confidence: 99%

Clowder

Marini, Gutierrez-Polo, Kooper, et al. 2018

Proceedings of the Practice and Experience on Advanced Research Computing

Clowder is an open source data management system to support data curation of long tail data and metadata across multiple research domains and diverse data types. Institutions and labs can install and customize their own instance of the framework on local hardware or on remote cloud computing resources to provide a shared service to distributed communities of researchers. Data can be ingested directly from instruments or manually uploaded by users and then shared with remote collaborators using a web front end. We discuss some of the challenges encountered in designing and developing a system that can be easily adapted to different scientific areas including digital preservation, geoscience, material science, medicine, social science, cultural heritage and the arts. Some of these challenges include support for large amounts of data, horizontal scaling of domain specific preprocessing algorithms, ability to provide new data visualizations in the web browser, a comprehensive Web service API for automatic data ingestion and curation, a suite of social

Strategy for Research Data Management Services in Indonesia

Marlina, Purwandari 2019

Procedia Computer Science
