Mixing time of exponential random graphs

Bhamidi, Sreekalyani Shankar; Bresler, Guy; Sly, Allan

doi:10.1214/10-aap740

Cited by 72 publications

(14 citation statements)

References 22 publications


“…It is known that the Markov chain rapidly mixes in the case of regular directed graphs, i.e., graphs in which all vertices have the same in- and out-degrees [16], but it appears to be slowly mixing for some exponential degree distributions [29]. It would be interesting to better understand the mixing time behavior of the chain we proposed for signed directed graphs.…”

Section: Discussion (mentioning)

confidence: 99%

Assessing statistical significance in causal graphs

et al. 2012

Background: Causal graphs are an increasingly popular tool for the analysis of biological datasets. In particular, signed causal graphs--directed graphs whose edges additionally have a sign denoting upregulation or downregulation--can be used to model regulatory networks within a cell. Such models allow prediction of downstream effects of regulation of biological entities; conversely, they also enable inference of causative agents behind observed expression changes. However, due to their complex nature, signed causal graph models present special challenges with respect to assessing statistical significance. In this paper we frame and solve two fundamental computational problems that arise in practice when computing appropriate null distributions for hypothesis testing.

Results: First, we show how to compute a p-value for agreement between observed and model-predicted classifications of gene transcripts as upregulated, downregulated, or neither. Specifically, how likely are the classifications to agree to the same extent under the null distribution of the observed classification being randomized? This problem, which we call "Ternary Dot Product Distribution" owing to its mathematical form, can be viewed as a generalization of Fisher's exact test to ternary variables. We present two computationally efficient algorithms for computing the Ternary Dot Product Distribution and investigate its combinatorial structure analytically and numerically to establish computational complexity bounds. Second, we develop an algorithm for efficiently performing random sampling of causal graphs. This enables p-value computation under a different, equally important null distribution obtained by randomizing the graph topology but keeping fixed its basic structure: connectedness and the positive and negative in- and out-degrees of each vertex. We provide an algorithm for sampling a graph from this distribution uniformly at random. We also highlight theoretical challenges unique to signed causal graphs; previous work on graph randomization has studied undirected graphs and directed but unsigned graphs.

Conclusion: We present algorithmic solutions to two statistical significance questions necessary to apply the causal graph methodology, a powerful tool for biological network analysis. The algorithms we present are both fast and provably correct. Our work may be of independent interest in non-biological contexts as well, as it generalizes mathematical results that have been studied extensively in other fields.
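
The first question in the abstract above (how often a randomized classification agrees with the prediction at least as well as the observed one) can be illustrated with a simple Monte Carlo permutation test. This is only a sketch under assumptions: classifications are encoded as +1 (upregulated), -1 (downregulated) and 0 (neither), agreement is scored as their dot product, and the p-value is estimated by shuffling. It is not one of the two exact algorithms the paper presents, and the function names are hypothetical.

```python
import random


def ternary_dot(predicted, observed):
    """Agreement score: +1 per matching sign, -1 per opposite sign, 0 otherwise."""
    return sum(p * o for p, o in zip(predicted, observed))


def permutation_p_value(predicted, observed, num_samples=10000, seed=0):
    """Estimate P(score >= actual score) when the observed labels are shuffled.

    Assumption: the null distribution is a uniformly random permutation of
    the observed classification, approximated here by repeated shuffling.
    """
    rng = random.Random(seed)
    actual = ternary_dot(predicted, observed)
    shuffled = list(observed)
    hits = 0
    for _ in range(num_samples):
        rng.shuffle(shuffled)
        if ternary_dot(predicted, shuffled) >= actual:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (num_samples + 1)
```

A call such as `permutation_p_value(predicted, observed)` returns an estimate whose resolution is limited by `num_samples`; an exact computation would need the combinatorial methods described in the cited paper.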

“…The rate of convergence for this Gibbs procedure was studied by Bhamidi, Bresler, and Sly (2008). There is also some work in progress on exact sampling (Butts 2012).…”
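
For orientation, the kind of Gibbs (Glauber) update whose convergence rate is at issue can be sketched as follows. This is a minimal illustration, assuming an edge-triangle model with Hamiltonian `beta_edge * (#edges) + beta_triangle * (#triangles) / n`; the normalization, the parameter names and the helper functions are assumptions made for this example and are not taken from the cited papers.

```python
import math
import random


def count_common_neighbors(adj, i, j):
    """Number of vertices adjacent to both i and j."""
    return sum(
        1 for k in range(len(adj))
        if k != i and k != j and adj[i][k] and adj[k][j]
    )


def glauber_step(adj, beta_edge, beta_triangle, rng):
    """Resample one uniformly chosen vertex pair from its conditional law."""
    n = len(adj)
    i, j = rng.sample(range(n), 2)
    # Gain in the (assumed) Hamiltonian from having edge {i, j} present:
    # one extra edge plus one triangle per common neighbor, scaled by 1/n.
    delta = beta_edge + beta_triangle * count_common_neighbors(adj, i, j) / n
    p_present = 1.0 / (1.0 + math.exp(-delta))
    adj[i][j] = adj[j][i] = 1 if rng.random() < p_present else 0


def run_glauber(n, beta_edge, beta_triangle, steps, seed=0):
    """Run the chain from the empty graph and return the adjacency matrix."""
    rng = random.Random(seed)
    adj = [[0] * n for _ in range(n)]
    for _ in range(steps):
        glauber_step(adj, beta_edge, beta_triangle, rng)
    return adj


def edge_density(adj):
    """Fraction of vertex pairs that are joined by an edge."""
    n = len(adj)
    edges = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    return edges / (n * (n - 1) / 2)
```

With `beta_triangle = 0` each pair is resampled independently, so the chain reduces to sampling an Erdős-Rényi graph; how many steps are needed in general is the question the mixing-time analysis addresses.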

Section: Exponential-family Random Graph Models: Global Network Ch (mentioning)

confidence: 99%

Computational Statistical Methods for Social Network Models

Hunter, Krivitsky, Schweinberger 2012

Journal of Computational and Graphical Statistics

We review the broad range of recent statistical work in social network models, with emphasis on computational aspects of these methods. Particular focus is applied to exponential-family random graph models (ERGM) and latent variable models for data on complete networks observed at a single time point, though we also briefly review many methods for incompletely observed networks and networks observed at multiple time points. Although we mention far more modeling techniques than we can possibly cover in depth, we provide numerous citations to current literature. We illustrate several of the methods on a small, well-known network dataset, Sampson’s monks, providing code where possible so that these analyses may be duplicated.

“…For the case where the starting point is far from MLE, the convergence of these approaches is rather poor. Bhamidi et al. [15] give a theoretical explanation: if the parameters are non-negative, then for large n, either the p_β model is essentially the same as an Erdős-Rényi model or the Markov chain takes exponential time to mix. This limits the application of MCMC-based approach to large networks.…”

Section: B Monte Carlo Based Approach (mentioning)

confidence: 99%

Estimation of exponential random graph models for large social networks via graph limits

Tian 2013

Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

Analyzing and modeling network data have become increasingly important in a wide range of scientific fields. Among popular models, exponential random graph models (ERGM) have been developed to study these complex networks. For large networks, however, maximum likelihood estimation (MLE) of parameters in these models can be very difficult, due to the unknown normalizing constant. Alternative strategies based on Markov chain Monte Carlo draw samples to approximate the likelihood, which is then maximized to obtain the MLE. These strategies have poor convergence due to model degeneracy issues. Chatterjee and Diaconis [1] propose a new theoretical framework for estimating the parameters of ERGM by approximating the normalizing constant using the emerging tools in graph theory, graph limits. In this paper, we construct a complete computational procedure built upon their results with practical innovations. More specifically, we evaluate the likelihood via simple function approximation of the corresponding ERGM's graph limit and iteratively maximize the likelihood to obtain the MLE. We also propose a new matching method to find a starting point for our iterative algorithm. Through simulation study and real data analysis of two large social networks, we show that our new method outperforms the MCMC-based method, especially when the network size is large (more than 100 nodes).
