The value of graph-based big data can be unlocked by exploring the topology and metrics of the networks they represent, and the computational approaches to this exploration take on many forms. For the use-case of performing global computations over a graph, it is first ingested into a graph processing system from one of many digital representations. Extracting information from graphs involves processing all their elements globally, which can be done with single-machine systems (with varying approaches to hardware usage), distributed systems (either homogeneous or heterogeneous groups of machines) and systems dedicated to high-performance computing (HPC). For these systems focused on processing the bulk of graph elements, common use-cases consist in executing for example algorithms for vertex ranking or community detection, which produce insights on graph structure and relevance of their elements. Many distributed systems (such as , ) and libraries (e.g. , ) have been built to enable these tasks and improve performance. This is achieved with techniques ranging from classic load balancing (often geared to reduce communication overhead) to exploring trade-offs between delaying computation and relaxing accuracy. In this survey we firstly familiarize the reader with common graph datasets and applications in the world of today. We provide an overview of different aspects of the graph processing landscape and describe classes of systems based on a set of dimensions we describe. The dimensions we detail encompass paradigms to express graph processing, different types of systems to use, coordination and communication models in distributed graph processing, partitioning techniques and different definitions related to the potential for a graph to be updated. This survey is aimed at both the experienced software engineer or researcher as well as the graduate student looking for an understanding of the landscape of solutions (and their limitations) for graph processing.
We address the problem of representing dynamic graphs using k 2 -trees. The k 2 -tree data structure is one of the succinct data structures proposed for representing static graphs, and binary relations in general. It relies on compact representations of bit vectors. Hence, by relying on compact representations of dynamic bit vectors, we can also represent dynamic graphs. In this paper we follow instead the ideas by Munro et al., and we present an alternative implementation for representing dynamic graphs using k 2 -trees. Our experimental results show that this new implementation is competitive in practice.
In eScience, where vast data collections are processed in scientific workflows, new risks and challenges are emerging. Those challenges are changing the eScience paradigm, mainly regarding digital preservation and scientific workflows. To address specific concerns with data management in these scenarios, the concept of the Data Management Plan was established, serving as a tool for enabling digital preservation in eScience research projects. We claim risk management can be jointly used with a Data Management Plan, so new risks and challenges can be easily tackled. Therefore, we propose an analysis process for eScience projects using a Data Management Plan and ISO 31000 in order to create a Risk Management Plan that can complement the Data Management Plan. The motivation, requirements and validation of this proposal are explored in the MetaGen-FRAME project, focused in Metagenomics.
No abstract
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.