Dai Hai Ton That scite author profile

Dai Hai Ton That

5 Publications

27 Citation Statements Received

58 Citation Statements Given


Affiliations

University of Alabama in Huntsville, DePaul University, Versailles Saint-Quentin-en-Yvelines University

Publications

Utilizing Provenance in Reusable Research Objects

et al. 2018

Science is conducted collaboratively, often requiring the sharing of knowledge about computational experiments. When experiments include only datasets, they can be shared using Uniform Resource Identifiers (URIs) or Digital Object Identifiers (DOIs). An experiment, however, seldom includes only datasets, but more often includes software, its past execution, provenance, and associated documentation. The Research Object has recently emerged as a comprehensive and systematic method for aggregation and identification of diverse elements of computational experiments. While a necessary method, mere aggregation is not sufficient for the sharing of computational experiments. Other users must be able to easily recompute on these shared research objects. Computational provenance is often the key to enable such reuse. In this paper, we show how reusable research objects can utilize provenance to correctly repeat a previous reference execution, to construct a subset of a research object for partial reuse, and to reuse existing contents of a research object for modified reuse. We describe two methods to summarize provenance that aid in understanding the contents and past executions of a research object. The first method obtains a process-view by collapsing low-level system information, and the second method obtains a summary graph by grouping related nodes and edges with the goal to obtain a graph view similar to application workflow. Through detailed experiments, we show the efficacy and efficiency of our algorithms.

The minimum use-case for sharing a computational experiment (in the form of a shared research object) involves repeating its original execution and verifying its results. To truly exploit its potential, however, it must support modified reuse. Therefore, the research object must be created and stored not as a simple aggregation of digital content, as previously advocated [2,6], but in a readily-computable form: as a reusable research object. We demonstrate the distinction in two ways.

Consider a typical research paper with an analysis based on large amounts of code and data, and assume that the researcher authoring the paper has used the code and data to conduct a number of experiments that produce the paper's target figures and results. The example paper's digital artifacts relating to its experiments may be bundled together in a medium such as a file archive (.tar), compressed file format (.gz), virtual image, or container. A shared research object is free to use any of these mediums. A reusable research object, however, must use a virtual image or container, since it must produce a computational research object that, when downloaded and shared, will guarantee an instantly-executable unit of computation.

Also consider the example paper's metadata, which, similar to the metadata in most papers, is interspersed throughout the project's written analysis, and throughout its code and data. The metadata can take many forms, including annotations, version information, and provenance. A shared research object's metadata ...

Sciunits: Reusable Research Objects

That, Fils, Yuan et al. 2017

Sciunits: Reusable Research Objects

That, Fils, Yuan et al. 2017

Preprint

TRIFL

That, Popa, Zeitouni 2015

ACM Trans. Spatial Algorithms Syst.

Due to several important features, such as high performance, low power consumption, and shock resistance, NAND flash has become a very popular stable storage medium for embedded mobile devices, personal computers, and even enterprise servers. However, the peculiar characteristics of flash memory require redesigning the existing data storage and indexing techniques that were devised for magnetic hard disks. In this article, we propose TRIFL, an efficient and generic TRajectory Index for FLash. TRIFL is designed around the key requirements of trajectory indexing and flash storage. TRIFL is generic in the sense that it is efficient for both simple flash storage devices such as SD cards and more powerful devices such as solid state drives. In addition, TRIFL is supplied with an online self-tuning algorithm that allows adapting the index structure to the workload and the technical specifications of the flash storage device to maximize the index performance. Moreover, TRIFL achieves good performance with relatively low memory requirements, which makes the index appropriate for many application scenarios. The experimental evaluation shows that TRIFL outperforms the representative indexing methods on magnetic disks and flash disks.

Advancing Open Science Through Innovative Data System Solutions: The Joint ESA-NASA Multi-Mission Algorithm and Analysis Platform (MAAP)'s Data Ecosystem

Bugbee, Ramachandran, Maskey et al. 2020
