Abstract: Graph sampling is frequently used to address scalability issues when analyzing large graphs. Many algorithms have been proposed to sample graphs, and the performance of these algorithms has been quantified through metrics based on graph structural properties preserved by the sampling: degree distribution, clustering coefficient, and others. However, a perspective that is missing is the impact of these sampling strategies on the resultant visualizations. In this paper, we present the results of three user studi…
“…Random Sampling Proxy graphs are representatives of larger graphs that are derived through sampling, filtering, or deriving a structural skeleton such as a spanning tree [ENH17, ZHA15, NHEM17, WCA*17, ZZC*17, RC05]. Therefore, these proxy graphs are missing some vertices and/or edges, and their visualizations will not show all the data.…”
This paper proposes a linear‐time repulsive‐force‐calculation algorithm with sub‐linear auxiliary space requirements, achieving an asymptotic improvement over the Barnes‐Hut and Fast Multipole Method force‐calculation algorithms. The algorithm, named random vertex sampling (RVS), achieves its speed by updating a random sample of vertices at each iteration, each with a random sample of repulsive forces. This paper also proposes a combination algorithm that uses RVS to derive an initial layout and then applies Barnes‐Hut to refine the layout. An evaluation of RVS and the combination algorithm compares their speed and quality on 109 graphs against a Barnes‐Hut layout algorithm. The RVS algorithm performs up to 6.1 times faster on the tested graphs while maintaining comparable layout quality. The combination algorithm also performs faster than Barnes‐Hut, but produces layouts that are more symmetric than using RVS alone. Data and code: https://osf.io/nb7m8/
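The abstract describes RVS only at a high level: each iteration updates a random sample of vertices, and each sampled vertex accumulates repulsive forces from a random sample of the other vertices. The sketch below is a hypothetical illustration of that idea, not the paper's implementation; the sample sizes, force constant, and the Coulomb-style force law are all assumptions.

```python
# Hypothetical sketch of a random-vertex-sampling (RVS) repulsive-force
# iteration. Only the sampling scheme comes from the abstract; the force
# law, constants, and scaling are illustrative assumptions.
import random

def rvs_iteration(pos, vertex_sample_size, force_sample_size, k=0.1):
    """Apply one RVS iteration in place to `pos`, a dict {vertex: [x, y]}."""
    vertices = list(pos)
    for v in random.sample(vertices, min(vertex_sample_size, len(vertices))):
        others = [u for u in vertices if u != v]
        m = min(force_sample_size, len(others))
        fx = fy = 0.0
        for u in random.sample(others, m):
            dx = pos[v][0] - pos[u][0]
            dy = pos[v][1] - pos[u][1]
            d2 = dx * dx + dy * dy or 1e-9  # avoid division by zero
            # Coulomb-style repulsion, scaled by len(others) / m so the
            # expected total force matches a full O(n) pass.
            scale = k * len(others) / m
            fx += scale * dx / d2
            fy += scale * dy / d2
        pos[v][0] += fx
        pos[v][1] += fy
    return pos
```

Because each iteration touches only a bounded sample of vertex pairs rather than all n vertices against a tree or multipole expansion, the per-iteration cost is independent of the quadtree bookkeeping that Barnes-Hut requires, which is consistent with the sub-linear auxiliary-space claim.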
“…Among the studies that are included in our survey, 15 studies use graphs with more than 1,000 nodes [12,24,37,83,88,89,95,124,132,136,138,143,152] and another nine that use graphs with more than 500 nodes [75,84,94,99,109,120,143,146].…”
For decades, researchers in information visualisation and graph drawing have focused on developing techniques for the layout and display of very large and complex networks. Experiments involving human participants have also explored the readability of different styles of layout and representations for such networks. In both bodies of literature, networks are frequently referred to as being 'large' or 'complex', yet these terms are relative. From a human-centred, experimental point of view, what constitutes 'large' (for example) depends on several factors, such as data complexity, visual complexity, and the technology used. In this paper, we survey the literature on human-centred experiments to understand how, in practice, different features and characteristics of node-link diagrams affect visual complexity.
“…For example, the algorithms tend to select the nodes with common degrees far more than the nodes with rare degrees to maintain the power law of degree distribution [66]. Human viewers prefer to observe large structures in advance but may ignore small structures when judging whether a sample is visually similar to the original graph [43,75].…”
Sampling is a widely used graph reduction technique to accelerate graph computations and simplify graph visualizations. By comprehensively analyzing the literature on graph sampling, we hypothesize that existing algorithms cannot effectively preserve minority structures that are rare and small in a graph but are very important in graph analysis. In this work, we first conduct a pilot user study to investigate representative minority structures that are most appealing to human viewers. We then perform an experimental study to evaluate the performance of existing graph sampling algorithms regarding minority structure preservation. Results confirm our hypothesis and suggest key points for designing a new graph sampling approach named mino-centric graph sampling (MCGS). In this approach, a triangle-based algorithm and a cut-point-based algorithm are proposed to efficiently identify minority structures. A set of importance assessment criteria are designed to guide the preservation of important minority structures. Three optimization objectives are introduced into a greedy strategy to balance the preservation between minority and majority structures and suppress the generation of new minority structures. A series of experiments and case studies are conducted to evaluate the effectiveness of the proposed MCGS.
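The abstract names two detectors for minority structures: a triangle-based algorithm and a cut-point-based algorithm. The sketch below only illustrates the underlying graph primitives (enumerating triangles and finding articulation points); MCGS's actual identification criteria, importance scoring, and greedy objectives are not reproduced here.

```python
# Illustrative primitives for the two detectors named in the MCGS abstract.
# This is a sketch of the graph operations involved, not the MCGS algorithms.
def triangles(adj):
    """Yield each triangle (u, v, w) once from adjacency dict {v: set(nbrs)}.

    Assumes vertices are mutually comparable (e.g. integer ids)."""
    for u in adj:
        for v in adj[u]:
            if v <= u:
                continue
            for w in adj[u] & adj[v]:  # common neighbours close a triangle
                if w > v:
                    yield (u, v, w)

def cut_points(adj):
    """Return articulation points via the standard low-link DFS."""
    disc, low, result = {}, {}, set()
    timer = [0]

    def dfs(u, parent):
        disc[u] = low[u] = timer[0]
        timer[0] += 1
        children = 0
        for v in adj[u]:
            if v == parent:
                continue
            if v in disc:                       # back edge
                low[u] = min(low[u], disc[v])
            else:                               # tree edge
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if parent is not None and low[v] >= disc[u]:
                    result.add(u)               # removing u disconnects v's subtree
        if parent is None and children > 1:
            result.add(u)                       # root with multiple DFS subtrees

    for u in adj:
        if u not in disc:
            dfs(u, None)
    return result
```

A sampler aiming to preserve minority structures could, for instance, pin the vertices returned by these scans before filling the remaining sample budget, which matches the abstract's description of balancing minority- and majority-structure preservation.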