Data provenance has created a growing need for technologies that empower end-users to assess and act on the data life cycle. In the Big Data era, the amount of data held by companies worldwide increases every day. As data grows, so does the metadata describing its origin and life cycle. This growth calls for innovations that use data provenance to provide a better understanding and interpretation of data. This study addresses the challenge of extracting data in the form of graphs from scientific workflows and supporting in-demand visualization approaches such as graph comparison, summarization, backward-forward querying, and stream data visualization. The W3C PROV-O provenance specification is implemented in a visualization tool to assess the applicability of the proposed algorithms. The proposed algorithms are tested on a large-scale provenance dataset to explore their performance. In addition, this study discusses the details of a comprehensive usability study of the prototype visualization tool. Results indicate that the proposed visualization approaches are usable and that the processing overhead is insignificant.