We propose to compress weighted graphs (networks), motivated by the observation that large networks of social, biological, or other relations can be complex to handle and visualize. In the process also known as graph simplification, nodes and (unweighted) edges are grouped to supernodes and superedges, respectively, to obtain a smaller graph. We propose models and algorithms for weighted graphs. The interpretation (i.e. decompression) of a compressed, weighted graph is that a pair of original nodes is connected by an edge if their supernodes are connected by one, and that the weight of an edge is approximated to be the weight of the superedge. The compression problem now consists of choosing supernodes, superedges, and superedge weights so that the approximation error is minimized while the amount of compression is maximized. In this paper, we formulate this task as the 'simple weighted graph compression problem'. We then propose a much wider class of tasks under the name of 'generalized weighted graph compression problem'. The generalized task extends the optimization to preserve longer-range connectivities between nodes, not just individual edge weights. We study the properties of these problems and propose a range of algorithms to solve them, with different balances between complexity and quality of the result. We evaluate the problems and algorithms experimentally on real networks. The results indicate that weighted graphs can be compressed efficiently with relatively little compression error.
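The decompression rule described in this abstract can be illustrated with a minimal Python sketch. The function names, the dictionary-based graph representation, and the choice of Euclidean distance as the error measure are assumptions made for illustration; the abstract does not specify how the approximation error is computed.

```python
from itertools import product
from math import sqrt


def decompress(supernodes, superedges):
    """Expand a compressed graph into approximate original edges.

    supernodes: dict mapping supernode name -> list of original nodes
    superedges: dict mapping (supernode, supernode) -> superedge weight
    Every pair of original nodes whose supernodes are joined by a
    superedge receives the weight of that superedge.
    """
    edges = {}
    for (a, b), weight in superedges.items():
        for u, v in product(supernodes[a], supernodes[b]):
            if u != v:  # skip self-loops when a superedge joins a supernode to itself
                edges[frozenset((u, v))] = weight
    return edges


def compression_error(original, decompressed):
    """Euclidean distance between edge-weight vectors (assumed error measure).

    Edges missing from either graph are treated as having weight 0.
    """
    keys = set(original) | set(decompressed)
    return sqrt(sum((original.get(k, 0.0) - decompressed.get(k, 0.0)) ** 2
                    for k in keys))


# Hypothetical toy graph: nodes a and b are merged into supernode S.
original = {frozenset(("a", "c")): 0.8, frozenset(("b", "c")): 0.6}
supernodes = {"S": ["a", "b"], "T": ["c"]}
superedges = {("S", "T"): 0.7}  # mean of the two merged edge weights

approx = decompress(supernodes, superedges)
error = compression_error(original, approx)
```

In this toy case two edges are replaced by one superedge, and each original weight is off by 0.1 after decompression, which shows the trade-off between compression and approximation error that the problem formulation balances.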
We propose a novel problem to simplify weighted graphs by pruning least important edges from them. Simplified graphs can be used to improve visualization of a network, to extract its main structure, or as a pre-processing step for other data mining algorithms. We define a graph connectivity function based on the best paths between all pairs of nodes. Given the number of edges to be pruned, the problem is then to select a subset of edges that best maintains the overall graph connectivity. Our model is applicable to a wide range of settings, including probabilistic graphs, flow graphs and distance graphs, since the path quality function that is used to find best paths can be defined by the user. We analyze the problem, and give lower bounds for the effect of individual edge removal in the case where the path quality function has a natural recursive property. We then propose a range of algorithms and report on experimental results on real networks derived from public biological databases. The results show that a large fraction of edges can be removed quite fast and with minimal effect on the overall graph connectivity. A rough semantic analysis of the removed edges indicates that few important edges were removed, and that the proposed approach could be a valuable tool in aiding users to view or explore weighted graphs.
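The problem statement in this abstract (prune a given number of edges while best maintaining connectivity) can be sketched as a brute-force greedy procedure. This is an illustrative assumption, not one of the paper's algorithms: path quality is taken to be the product of edge probabilities, connectivity is taken to be the mean best-path quality over all node pairs, and the names `connectivity` and `greedy_prune` are hypothetical. The graph is assumed to have at least two nodes.

```python
from itertools import combinations


def connectivity(nodes, edges):
    """Mean best-path quality over all node pairs (assumed definition).

    Path quality is the product of edge probabilities; best qualities
    are found with a Floyd-Warshall style recurrence.
    """
    q = {(u, v): (1.0 if u == v else 0.0) for u in nodes for v in nodes}
    for (u, v), p in edges.items():
        q[(u, v)] = q[(v, u)] = max(q[(u, v)], p)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                via_k = q[(i, k)] * q[(k, j)]
                if via_k > q[(i, j)]:
                    q[(i, j)] = via_k
    pairs = list(combinations(nodes, 2))
    return sum(q[pair] for pair in pairs) / len(pairs)


def greedy_prune(nodes, edges, k):
    """Remove k edges, each time dropping the edge whose removal
    leaves the highest connectivity."""
    kept = dict(edges)
    for _ in range(min(k, len(kept))):
        def without(e):
            return {f: w for f, w in kept.items() if f != e}
        victim = max(kept, key=lambda e: connectivity(nodes, without(e)))
        del kept[victim]
    return kept


# Hypothetical toy graph with one weak, redundant edge.
nodes = ["a", "b", "c"]
edges = {("a", "b"): 0.9, ("b", "c"): 0.9, ("a", "c"): 0.5}
pruned = greedy_prune(nodes, edges, 1)
ratio = connectivity(nodes, pruned) / connectivity(nodes, edges)
```

Here the direct edge between a and c is weaker than the two-step path through b, so removing it leaves the connectivity unchanged. The sketch recomputes all-pairs qualities for every candidate edge, which is only practical for very small graphs.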
Aromatase P450 (P450arom) is the key enzyme for the biosynthesis of estrogen that is essential for the growth of human endometriosis, a pathology characterized by endometrium-like tissue on the peritoneal surfaces of abdominal organs manifest by pelvic pain and infertility. Surgically transplanted autologous uterine tissue to ectopic sites on the peritoneum in mice has been used as an animal model to study endometriosis. Using this mouse model, we evaluated the roles of the P450arom gene and aromatase enzyme activity in the growth of endometriosis represented by ectopic uterine tissues in mice. Endometriosis was induced surgically in the following groups of mice: 1) untreated transgenic mice with disrupted P450arom gene (ArKO); 2) ArKO mice treated with systemic estrogen; 3) untreated wild-type (WT) mice; 4) WT mice treated with estrogen; 5) WT mice treated with the aromatase inhibitor, letrozole; and 6) WT mice treated with letrozole and estrogen. Each group contained eight mice; +/+ littermates of ArKO mice were used as WT controls. Treatment with estrogen increased the size of ectopic uterine tissues in ArKO and WT mice significantly. The ectopic uterine lesions in untreated and estrogen-treated ArKO mice were strikingly smaller than those in untreated and estrogen-treated WT controls, respectively. Systemic treatment of WT mice with letrozole significantly decreased the lesion size in a dose-dependent manner. The addition of estrogen to letrozole treatment increased the ectopic lesion size, although these lesions were significantly smaller than those in mice treated with estrogen only. As tissue controls, the effects of these conditions on normally located (eutopic) uterine tissue were evaluated. The effects of disruption of the P450arom gene and treatments with letrozole and estrogen seemed to be more profound on ectopic tissues, suggesting that ectopic tissues might be more sensitive to estrogen for growth. 
We conclude that both an intact P450arom gene and the presence of aromatase enzyme activity are essential for the growth of ectopic uterine tissue in a mouse model of endometriosis.
Group C rotaviruses have been identified recently from fecal samples of children with diarrhea in the United States. Using reverse transcriptase-polymerase chain reaction and sequence analysis, we sequenced gene 8s encoding VP7 from two U.S. strains (RI-1 and RI-2), and eight other strains isolated from patients on four continents, and compared these with the sequences of four published strains. The gene 8s of the 14 strains were remarkably conserved in size and in predicted primary and secondary structures. When the sequences of the human VP7s were compared with that of the prototype porcine Cowden strain, six regions were found variable in both deduced primary and predicted secondary structures, four of which were predicted to be hydrophilic and might determine serotype specificity. Gene 8 of the human S-1 strain was further characterized by expression in recombinant baculoviruses. The expressed product was immunogenic but failed to elicit neutralizing antibodies. Our sequence analysis indicates that all the human strains characterized to date belong to a single G genotype, which may constitute a single G serotype, pending further antigenic analysis. Whether the human strains and the Cowden strain are the same serotype remains to be determined.
We propose a generic framework and methods for simplification of large networks. The methods can be used to improve the understandability of a given network, to complement user-centric analysis methods, or as a pre-processing step for computationally more complex methods. The approach is path-oriented: edges are pruned while keeping the original quality of best paths between all pairs of nodes (but not necessarily all best paths). The framework is applicable to different kinds of graphs (for instance flow networks and random graphs) and connections can be measured in different ways (for instance by the shortest path, maximum flow, or maximum probability). It has relative neighborhood graphs, spanning trees, and certain Pathfinder graphs as its special cases. We give four algorithmic variants and report on experiments with 60 real biological networks. The simplification methods are part of ongoing projects for intelligent analysis of networked information.
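The path-oriented idea in this abstract, pruning edges while keeping the quality of the best path between every pair of nodes, can be sketched in Python for one of the measures it mentions, maximum probability. This is a minimal illustration and not one of the paper's four algorithmic variants; the function names are hypothetical, edge weights are assumed to be probabilities in (0, 1], and the graph is assumed undirected. An edge is dropped only if some alternative path between its endpoints is at least as good as the edge itself, so no best-path quality decreases.

```python
import heapq


def best_path_quality(adj, source, target):
    """Highest product of edge probabilities over paths from source to target.

    Dijkstra-style search; valid because probabilities are in (0, 1],
    so extending a path never increases its quality.
    """
    best = {source: 1.0}
    heap = [(-1.0, source)]
    while heap:
        neg_q, u = heapq.heappop(heap)
        q = -neg_q
        if u == target:
            return q
        if q < best.get(u, 0.0):
            continue  # stale heap entry
        for v, p in adj[u].items():
            cand = q * p
            if cand > best.get(v, 0.0):
                best[v] = cand
                heapq.heappush(heap, (-cand, v))
    return 0.0  # target unreachable


def prune_lossless(edges):
    """Drop every edge that an alternative path matches or beats."""
    adj = {}
    for (u, v), p in edges.items():
        adj.setdefault(u, {})[v] = p
        adj.setdefault(v, {})[u] = p
    kept = dict(edges)
    # Examine the weakest edges first; they are the likeliest to be redundant.
    for (u, v), p in sorted(edges.items(), key=lambda item: item[1]):
        del adj[u][v]
        del adj[v][u]
        if best_path_quality(adj, u, v) >= p:
            del kept[(u, v)]  # redundant: a path at least as good remains
        else:
            adj[u][v] = p  # needed: restore the edge
            adj[v][u] = p
    return kept


# Hypothetical toy graph: the a-c edge (0.5) is beaten by a-b-c (0.81).
edges = {("a", "b"): 0.9, ("b", "c"): 0.9, ("a", "c"): 0.5}
simplified = prune_lossless(edges)
```

Swapping the product for a sum of distances (with the comparison reversed) would give the shortest-path setting; exact ties depend on floating-point comparison, so a tolerance may be needed in practice.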