A classic problem in computational biology is the identification of altered subnetworks: subnetworks of an interaction network that contain genes/proteins that are differentially expressed, highly mutated, or otherwise aberrant compared to other genes/proteins. Numerous methods have been developed to solve this problem under various assumptions, but the statistical properties of these methods are often unknown. For example, some widely-used methods are reported to output very large subnetworks that are difficult to interpret biologically. In this work, we formulate the identification of altered subnetworks as the problem of estimating the parameters of a class of probability distributions which we call the Altered Subset Distribution (ASD). We derive a connection between a popular method, jActiveModules, and the maximum likelihood estimator (MLE) of the ASD. We show that the MLE is statistically biased, explaining the large subnetworks output by jActiveModules. We introduce NetMix, an algorithm that uses Gaussian mixture models to obtain less biased estimates of the parameters of the ASD. We demonstrate that NetMix outperforms existing methods in identifying altered subnetworks on both simulated and real data, including the identification of differentially expressed genes from both microarray and RNA-seq experiments and the identification of cancer driver genes in somatic mutation data. Availability: NetMix is available online at https://github.com/raphael-group/netmix.
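To make the Gaussian mixture idea concrete, the following is a minimal sketch, not NetMix itself. It assumes (this is not stated in the abstract) that unaltered genes have scores drawn from N(0, 1) and altered genes from N(mu, 1), and it fits the altered fraction `alpha` and the shift `mu` with a plain EM loop. The function names, the starting values, and the simulated data are all hypothetical.

```python
import math
import random


def normal_pdf(x, mu):
    """Density of a unit-variance normal with mean mu at x."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)


def fit_two_component(scores, iters=300):
    """EM for the mixture (1 - alpha) * N(0, 1) + alpha * N(mu, 1).

    Assumption: the null component is fixed at N(0, 1); only alpha and mu
    are estimated. Returns (alpha, mu).
    """
    alpha, mu = 0.5, 1.0  # arbitrary starting values (hypothetical choice)
    for _ in range(iters):
        # E-step: posterior probability that each score is "altered".
        resp = []
        for x in scores:
            altered = alpha * normal_pdf(x, mu)
            null = (1.0 - alpha) * normal_pdf(x, 0.0)
            resp.append(altered / (altered + null))
        # M-step: update the mixing weight and the altered mean.
        total = sum(resp)
        alpha = total / len(scores)
        mu = sum(r * x for r, x in zip(resp, scores)) / total
    return alpha, mu


# Simulated scores: roughly 10% of 2000 genes are shifted by mu = 3.
rng = random.Random(0)
scores = [
    rng.gauss(3.0, 1.0) if rng.random() < 0.1 else rng.gauss(0.0, 1.0)
    for _ in range(2000)
]
alpha_hat, mu_hat = fit_two_component(scores)
print(alpha_hat, mu_hat)
```

An estimate such as `alpha_hat * len(scores)` could then serve as a target size for the altered subnetwork, instead of letting a likelihood maximization over subnetworks choose the size freely. How NetMix actually uses its mixture fit is described in the paper and repository, not in this sketch.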
Recent studies suggest that social media usage, while linked to an increased diversity of information and perspectives for users, has exacerbated user polarization on many issues. A popular theory for this phenomenon centers on the concept of "filter bubbles": by automatically recommending content that a user is likely to agree with, social network algorithms create echo chambers of similarly-minded users that would not have arisen otherwise [54]. However, while echo chambers have been observed in real-world networks, the evidence for filter bubbles is largely post-hoc. In this work, we develop a mathematical framework to study the filter bubble theory. We modify the classic Friedkin-Johnsen opinion dynamics model by introducing another actor, the network administrator, who filters content for users by making small changes to the edge weights of a social network (for example, adjusting a news feed algorithm to change the level of interaction between users). On real-world networks from Reddit and Twitter, we show that when the network administrator is incentivized to reduce disagreement among users, even relatively small edge changes can result in the formation of echo chambers in the network and increase user polarization. We theoretically support this observed sensitivity of social networks to outside intervention by analyzing synthetic graphs generated from the stochastic block model. Finally, we show that a slight modification to the incentives of the network administrator can mitigate the filter bubble effect while minimally affecting the administrator's target objective, user disagreement.
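For readers unfamiliar with the Friedkin-Johnsen model, here is a minimal sketch of its standard form. Each user i has a fixed innate opinion s_i and repeatedly sets its expressed opinion to the weighted average of s_i and its neighbors' expressed opinions. The polarization and disagreement measures below are commonly used definitions and are assumed here, since the abstract does not define them. All function names are hypothetical, and the network administrator's optimization is not modeled.

```python
def fj_equilibrium(weights, innate, iters=1000, tol=1e-12):
    """Iterate the Friedkin-Johnsen update until opinions stop changing.

    weights: symmetric n x n list of lists of non-negative edge weights
             with a zero diagonal.
    innate:  list of n innate opinions s_i.
    Update:  z_i <- (s_i + sum_j w_ij * z_j) / (1 + sum_j w_ij).
    """
    n = len(innate)
    z = list(innate)
    for _ in range(iters):
        new_z = []
        for i in range(n):
            degree = sum(weights[i])
            pull = sum(weights[i][j] * z[j] for j in range(n))
            new_z.append((innate[i] + pull) / (1.0 + degree))
        if max(abs(a - b) for a, b in zip(new_z, z)) < tol:
            return new_z
        z = new_z
    return z


def polarization(z):
    """Sum of squared deviations of opinions from their mean (assumed definition)."""
    mean = sum(z) / len(z)
    return sum((x - mean) ** 2 for x in z)


def disagreement(weights, z):
    """Sum over edges of w_ij * (z_i - z_j)^2 (assumed definition)."""
    n = len(z)
    return sum(
        weights[i][j] * (z[i] - z[j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


# Two users joined by one unit-weight edge, with opposite innate opinions.
weights = [[0.0, 1.0], [1.0, 0.0]]
z = fj_equilibrium(weights, [0.0, 1.0])
print(z, polarization(z), disagreement(weights, z))
```

In this setting, an administrator of the kind the abstract describes would perturb `weights` slightly and let the users re-equilibrate. Comparing `polarization` and `disagreement` before and after such a perturbation is one way to explore the effect on a toy graph.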
Hypergraphs are used in machine learning to model higher-order relationships in data. While spectral methods for graphs are well-established, spectral theory for hypergraphs remains an active area of research. In this paper, we use random walks to develop a spectral theory for hypergraphs with edge-dependent vertex weights: hypergraphs where every vertex v has a weight γ_e(v) for each incident hyperedge e that describes the contribution of v to the hyperedge e. We derive a random walk-based hypergraph Laplacian, and bound the mixing time of random walks on such hypergraphs. Moreover, we give conditions under which random walks on such hypergraphs are equivalent to random walks on graphs. As a corollary, we show that current machine learning methods that rely on Laplacians derived from random walks on hypergraphs with edge-independent vertex weights do not utilize higher-order relationships in the data. Finally, we demonstrate the advantages of hypergraphs with edge-dependent vertex weights on ranking applications using real-world datasets.
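As an illustration of what a random walk with edge-dependent vertex weights can look like, the sketch below builds a transition matrix for one natural two-step walk. From vertex v, it picks an incident hyperedge e with probability proportional to a hyperedge weight, then picks a vertex u in e with probability proportional to γ_e(u). The abstract does not spell out the walk, so this rule, the hyperedge weights, and every name in the code are assumptions.

```python
def transition_matrix(vertices, hyperedges):
    """Transition matrix of a two-step hypergraph random walk.

    vertices:   list of vertex labels.
    hyperedges: list of (edge_weight, gamma) pairs, where gamma maps each
                vertex in the hyperedge to its weight gamma_e(vertex).
    A vertex with no incident hyperedge gets an all-zero row.
    """
    index = {v: i for i, v in enumerate(vertices)}
    n = len(vertices)
    P = [[0.0] * n for _ in range(n)]
    for v in vertices:
        incident = [(w, gamma) for w, gamma in hyperedges if v in gamma]
        degree = sum(w for w, _ in incident)  # total incident edge weight
        for w, gamma in incident:
            delta = sum(gamma.values())  # total vertex weight inside e
            for u, g in gamma.items():
                # P(choose e from v) * P(choose u within e)
                P[index[v]][index[u]] += (w / degree) * (g / delta)
    return P


# Hyperedge e1 = {a, b} weights b twice as heavily as a; e2 = {b, c} is uniform.
vertices = ["a", "b", "c"]
hyperedges = [
    (1.0, {"a": 1.0, "b": 2.0}),
    (1.0, {"b": 1.0, "c": 1.0}),
]
P = transition_matrix(vertices, hyperedges)
for row in P:
    print(row)
```

If γ_e(u) did not depend on e, the second step would weight each vertex identically in every hyperedge. This is the edge-independent case that the abstract contrasts with.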